Abstract

Interactive behaviors are ubiquitous in modern cryptography, but are also present in
λ-calculi, in the form of higher-order constructions. Traditionally, however, typed λ-calculi
simply do not fit well into cryptography, being both deterministic and too powerful in terms of
the complexity of the functions they can express. We study interaction in a λ-calculus for
probabilistic polynomial time computable functions. In particular, we show how notions of context
equivalence and context metric can both be characterized by way of traces when defined on
linear contexts. We then give evidence on how this can be turned into a proof methodology
for computational indistinguishability, a key notion in modern cryptography. We also hint at
what happens if a more general notion of a context is used.


1 Introduction 

Modern cryptography [13] is centered around the idea that security of cryptographic constructions
needs to be defined precisely and, in particular, that crucial aspects are how an adversary interacts
with the construction, and when he wins this game. The former is usually specified by way of
an experiment, while the latter is often formulated stipulating that the probability of a favorable
result for the adversary needs to be small, where being “small” usually means being negligible 
in a security parameter. This framework would however be vacuous if the adversary had access 
to an unlimited amount of resources, or if it were deterministic. As a consequence the adversary 
is usually assumed to work within probabilistic polynomial time (PPT in the following), this 
way giving rise to a robust definition. Summing up, there are three key concepts here, namely 
interaction , probability and complexity. Security as formulated above can often be spelled out 
semantically as the so-called computational indistinguishability between two distributions, the 
first one being the one produced by the construction and the second one modeling an idealized 
construction or a genuinely random object. 

Typed λ-calculi, as traditionally conceived, do not fit well into this picture. Higher-order types
clearly allow a certain degree of interaction, but probability and complexity are usually absent:
reduction is deterministic (or at least confluent), while the expressive power of λ-calculi tends to
be very high. This picture has somewhat changed in the last ten years: there have been some
successful attempts at giving probabilistic λ-calculi whose representable functions coincide with
the ones which can be computed by PPT algorithms dnasi. These calculi invariably took the 
form of restrictions on Gödel’s T, endowed with a form of binary probabilistic choice. All this
has been facilitated by implicit computational complexity, which offers the right idioms to start 
from [10], themselves based on linearity and ramification. The emphasis in all these works was
either the characterization of probabilistic complexity classes [1] , or more often security PH HU US]: 
one could see λ-calculi as a way to specify cryptographic constructions and adversaries for them.
The crucial idea here is that computational indistinguishability can be formulated as a form of
context equivalence. The real challenge, however, is whether all this can be characterized by
handier notions, which would alleviate the inherently difficult task of dealing with all contexts
when proving two terms to be equivalent.

The literature offers many proposals going precisely in this direction: this includes logical 
relations, context lemmas, or coinductive techniques. In applicative bisimulation pQ, as an example, 
terms are modeled as interactive objects. This way, one focuses on how the interpreted program 
interacts with its environment, rather than on its internal evolution. None of them have so far 
been applied to calculi capturing probabilistic polynomial time, and relatively few among them 
handle probabilistic behavior. 

In this paper, we study notions of equivalence and distance in one of these λ-calculi, called
RSLR g], More precisely: 

• After having briefly introduced RSLR and studied its basic metatheoretical properties
(Section 2), we define linear context equivalence. We then show how the role of contexts can be
played by traces. Finally, a coinductive notion of equivalence in the style of Abramsky’s
bisimulation is shown to be a congruence, thus included in context equivalence, but not to
coincide with it. We also hint at how all this can be extended to metrics. This can be found
in Section 3.

• We then introduce a notion of parametrized context equivalence for RSLR terms, showing that 
it coincides with computational indistinguishability when the compared programs are of base 
type. We then turn our attention to the problem of characterizing the obtained notion of 
equivalence by way of linear tests, giving a positive answer to that by way of a notion of 
parametrized trace metric. A brief discussion about the role of linear contexts in cryptography 
is also given. All this is in Section 5.

2 Characterizing Probabilistic Polynomial Time 

In this section we introduce RSLR Tj, a λ-calculus for probabilistic polynomial time
computation, obtained by extending Hofmann’s SLR [11] with an operator for binary probabilistic choice.
Compared to other presentations of the same calculus, we consider a call-by-value reduction but
elide nonlinear function spaces and pairs. This has the advantage of making the whole theory less
baroque, without any fundamental loss in expressiveness (see Section 5.3 below).

First of all, types are defined as follows: 

$$A ::= \mathsf{Str} \mid \blacksquare A \to A \mid \square A \to A.$$

The expression $\mathsf{Str}$ serves to type strings, and is the only base type. $\blacksquare A \to B$ is the type of
functions (from $A$ to $B$) which can be evaluated in constant time, while for $\square A \to B$ the running
time can be any polynomial. Aspects are the elements of $\{\square, \blacksquare\}$ and are indeed fundamental to
ensure polytime soundness. We denote them with metavariables like $a$ or $b$. We define a partial
order $<:$ between aspects simply as $\{(\square, \square), (\square, \blacksquare), (\blacksquare, \blacksquare)\}$, and a subtyping by using the rules in
Figure 1.


$$
\frac{}{A <: A}
\qquad
\frac{A <: B \quad B <: C}{A <: C}
\qquad
\frac{B <: A \quad C <: D \quad a <: b}{aA \to C <: bB \to D}
$$

Figure 1: Subtyping Rules

The syntactical categories of terms and values are constructed by the following grammar: 
$$t ::= x \mid v \mid 0(t) \mid 1(t) \mid \mathsf{tail}(t) \mid tt \mid \mathsf{case}_A(t,t,t,t) \mid \mathsf{rec}_A(t,t,t,t) \mid \mathsf{rand};$$
$$v ::= m \mid \lambda x : aA.t;$$

where $m$ ranges over the set $\{0,1\}^*$ of finite, binary strings, while $x$ ranges over a denumerable set
of variables $X$. We write $\mathbb{T}$, $\mathbb{V}$ for the sets of terms and values, respectively. The operators 0 and
1 are constructors for binary strings, while $\mathsf{tail}$ is a destructor. The only nonstandard constant is
$\mathsf{rand}$, which returns 0 or 1, each with probability $\frac{1}{2}$, thus modeling uniform binary choice. The
terms $\mathsf{case}_A(t, t_0, t_1, t_\varepsilon)$ and $\mathsf{rec}_A(t, t_0, t_1, t_\varepsilon)$ are terms for case distinction and recursion, in which
the first argument specifies the term (of base type) which guides the process. Informally, then, we
have the following rules:

$$
\begin{array}{ll}
\mathsf{case}_A(\varepsilon, t_0, t_1, t_\varepsilon) \to t_\varepsilon; & \mathsf{rec}_A(\varepsilon, t_0, t_1, t_\varepsilon) \to t_\varepsilon; \\
\mathsf{case}_A(0m, t_0, t_1, t_\varepsilon) \to t_0; & \mathsf{rec}_A(0m, t_0, t_1, t_\varepsilon) \to (t_0\,0m)(\mathsf{rec}_A(m, t_0, t_1, t_\varepsilon)); \\
\mathsf{case}_A(1m, t_0, t_1, t_\varepsilon) \to t_1; & \mathsf{rec}_A(1m, t_0, t_1, t_\varepsilon) \to (t_1\,1m)(\mathsf{rec}_A(m, t_0, t_1, t_\varepsilon)).
\end{array}
$$

The expression $\varepsilon$ stands for the empty string and we set $\mathsf{tail}(\varepsilon) \to \varepsilon$. Given a string $m$, $T(m)$ is
the set of strings whose tail is $m$, e.g. $T(\varepsilon) = \{\varepsilon, 0, 1\}$.
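The informal rules above can be mirrored by a small sketch on plain Python strings. This is only an illustration of the deterministic fragment, not part of the calculus: the names `tail`, `case`, `rec` and `tail_preimage` are hypothetical helpers, binary strings are modeled as Python `str`, and we assume (consistently with $T(\varepsilon) = \{\varepsilon, 0, 1\}$ and with the shape of the recursion rules) that `tail` drops the leading bit.

```python
def tail(m: str) -> str:
    """Drop the leading bit; the tail of the empty string is the empty string."""
    return m[1:]


def case(m: str, t0, t1, te):
    """case(eps) -> te, case(0m) -> t0, case(1m) -> t1."""
    if m == "":
        return te
    return t0 if m[0] == "0" else t1


def rec(m: str, t0, t1, te):
    """rec(eps) -> te, rec(bm) -> (t_b bm)(rec(m)) for b in {0, 1}.

    Step functions are curried: they take the current string first,
    then the result of the recursive call.
    """
    if m == "":
        return te
    step = t0 if m[0] == "0" else t1
    return step(m)(rec(m[1:], t0, t1, te))


def tail_preimage(m: str) -> set:
    """T(m): the set of strings whose tail is m (assumed helper name)."""
    result = {"0" + m, "1" + m}
    if m == "":
        result.add("")  # tail(eps) = eps
    return result


# Example: string reversal as a recursion; each step appends the leading bit.
append0 = lambda w: lambda z: z + "0"
append1 = lambda w: lambda z: z + "1"
reversed_string = rec("011", append0, append1, "")
```

Here `reversed_string` is obtained by unfolding the recursion rule three times, once per bit of the guiding string.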

As usual, a typing context $\Gamma$ is a finite set of assignments of an aspect and a type to a variable,
where any variable occurs at most once. Any such assignment is indicated with $x : aA$.
The expression $\Gamma, \Delta$ stands for the union of the two typing contexts $\Gamma$ and $\Delta$, which are assumed
to be disjoint. The union $\Gamma, \Delta$ is indicated with $\Gamma; \Delta$ whenever we want to insist on $\Gamma$ to only
involve the base type $\mathsf{Str}$. Typing judgments are in the form $\Gamma \vdash t : A$. Typing rules are in Figure 2.
The expression $\mathbb{T}^{\Gamma}_{A}$ (respectively, $\mathbb{V}^{\Gamma}_{A}$) stands for the set of terms (respectively, values) of type $A$


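$$
\frac{x : aA \in \Gamma}{\Gamma \vdash x : A}
\qquad
\frac{}{\Gamma \vdash m : \mathsf{Str}}
\qquad
\frac{\Gamma \vdash t : \mathsf{Str}}{\Gamma \vdash 0(t) : \mathsf{Str}}
\qquad
\frac{\Gamma \vdash t : \mathsf{Str}}{\Gamma \vdash 1(t) : \mathsf{Str}}
\qquad
\frac{\Gamma \vdash t : \mathsf{Str}}{\Gamma \vdash \mathsf{tail}(t) : \mathsf{Str}}
\qquad
\frac{}{\Gamma \vdash \mathsf{rand} : \mathsf{Str}}
$$

$$
\frac{\Gamma; \Delta_1 \vdash t : \mathsf{Str} \quad \Gamma; \Delta_2 \vdash t_0 : A \quad \Gamma; \Delta_3 \vdash t_1 : A \quad \Gamma; \Delta_4 \vdash t_\varepsilon : A}
     {\Gamma; \Delta_1, \Delta_2, \Delta_3, \Delta_4 \vdash \mathsf{case}_A(t, t_0, t_1, t_\varepsilon) : A}
$$

$$
\frac{\Gamma, x : aA \vdash t : B}{\Gamma \vdash \lambda x : aA.t : aA \to B}
\qquad
\frac{\Gamma \vdash t : A \quad A <: B}{\Gamma \vdash t : B}
\qquad
\frac{\Gamma; \Delta_1 \vdash t : aA \to B \quad \Gamma; \Delta_2 \vdash s : A \quad \Gamma, \Delta_2 <: a}
     {\Gamma; \Delta_1, \Delta_2 \vdash ts : B}
$$

$$
\frac{\begin{array}{c}
\Gamma_1; \Delta_1 \vdash t : \mathsf{Str} \qquad \Gamma_1, \Gamma_2, \Gamma_3; \Delta_2 \vdash t_\varepsilon : A \\
\Gamma_1, \Gamma_2 \vdash t_0 : \square\mathsf{Str} \to \blacksquare A \to A \qquad \Gamma_1; \Delta_1 <: \square \\
\Gamma_1, \Gamma_3 \vdash t_1 : \square\mathsf{Str} \to \blacksquare A \to A \qquad A \text{ is } \square\text{-free}
\end{array}}
{\Gamma_1, \Gamma_2, \Gamma_3; \Delta_1, \Delta_2 \vdash \mathsf{rec}_A(t, t_0, t_1, t_\varepsilon) : A}
$$

Figure 2: RSLR’s Typing Rules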


under the typing context $\Gamma$. Please observe how the type system we have just introduced enforces
variables of higher-order type to occur free at most once and outside the scope of a recursion.
Moreover, the types of terms which serve as step-functions in a recursion are assumed to be $\square$-free,
and this is precisely what allows this calculus to characterize polytime functions.

The operational semantics of RSLR is of course probabilistic: any closed term $t$ evaluates not
to a single value but to a value distribution, i.e., a function $\mathcal{D} : \mathbb{V} \to \mathbb{R}$ such that $\sum_{v \in \mathbb{V}} \mathcal{D}(v) = 1$.
Judgments expressing this fact are in the form $t \Downarrow \mathcal{D}$, and are derived through a formal
system whose rules are in Figure 3. In the figure, and in the rest of this paper, we use some
standard notation on distributions. More specifically, the expression $\{v_1^{\alpha_1}, \ldots, v_n^{\alpha_n}\}$ stands for the
distribution assigning probability $\alpha_i$ to $v_i$ (for every $1 \le i \le n$). The support of a distribution $\mathcal{D}$
is indicated with $S(\mathcal{D})$. Given a set $X$, $P_X$ is the set of all distributions over $X$. Noticeably:

Lemma 1 For every term $t \in \mathbb{T}^{\emptyset}_{A}$ there is a unique distribution $\mathcal{D}$ such that $t \Downarrow \mathcal{D}$, which we
denote as $[\![t]\!]$. Moreover, if $v \in S(\mathcal{D})$, then $v \in \mathbb{V}^{\emptyset}_{A}$.

Proof. We proceed by induction on the structure of $t$.

• If we have a value $v$, then by the rules it converges to $\{v^1\}$.

• Similarly, if we have the term $\mathsf{rand}$, the only distribution it can converge to is $\{0^{1/2}, 1^{1/2}\}$.

• Suppose now to have $t_1 t_2$, and suppose $t_1 t_2 \Downarrow \mathcal{D}$ and $t_1 t_2 \Downarrow \mathcal{D}'$.

By construction we have:

$$
\mathcal{D} = \sum_{\lambda x.t,\, v} \mathcal{D}_1(\lambda x.t) \cdot \mathcal{D}_2(v) \cdot \mathcal{D}_{t,v}
\qquad
\mathcal{D}' = \sum_{\lambda x.t',\, v'} \mathcal{D}'_1(\lambda x.t') \cdot \mathcal{D}'_2(v') \cdot \mathcal{D}'_{t',v'}
$$

But by induction hypothesis we have $\mathcal{D}_1 = \mathcal{D}'_1$, $\mathcal{D}_2 = \mathcal{D}'_2$ and so also $\mathcal{D}_{t,v} = \mathcal{D}'_{t,v}$, and this
means $\mathcal{D} = \mathcal{D}'$.

• All the other cases are similar.

The second point comes from the fact that, given a term $t$ such that $\vdash t : A$, if it reduces to
$\{(t_i)^{p_i}\}$ we have that $\vdash t_i : A$ for every $i$. This is proved by induction on the type derivation. So by
combining the fact that the type is preserved by reduction and the uniqueness of $\mathcal{D}$ we have that
for all $v \in S(\mathcal{D})$, $\vdash v : A$. □

A probabilistic function on $\{0,1\}^*$ is a function $F$ from $\{0,1\}^*$ to $P_{\{0,1\}^*}$. A term $t \in \mathbb{T}^{\emptyset}_{\square\mathsf{Str} \to \mathsf{Str}}$ is
said to compute $F$ iff for every string $m \in \{0,1\}^*$ it holds that $tm \Downarrow \mathcal{D}$ where $\mathcal{D}(n) = F(m)(n)$ for
every $n \in \{0,1\}^*$. What makes RSLR very interesting, however, is that it precisely captures those
probabilistic functions which can be computed in polynomial time (see, e.g., [5] for a definition):

Theorem 1 (Polytime Completeness) The set of probabilistic functions which can be computed
by RSLR terms coincides with the polytime computable ones.

This result is well-known ED SI, and can be proved in various ways, e.g. combinatorially or 
categorically. 

We conclude this section by giving two RSLR programs. Both of them receive a string in input.
The first one returns the same string. The second one, instead, produces a random string and
compares it to the one received in input; if they are different it returns the same string, otherwise
it returns the opposite.

$$
t := \lambda x : \square\mathsf{Str}.x
\qquad
s := \lambda x : \square\mathsf{Str}.\mathsf{case}_{\mathsf{Str}}(x = (\mathsf{RBG}\ x), x, \neg x, \neg x)
$$

where:

$$
\mathsf{RBG} := \lambda y : \square\mathsf{Str}.\mathsf{rec}_{\mathsf{Str}}(y, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon)
\qquad
f_{\mathsf{RBG}} := \lambda w : \square\mathsf{Str}.\lambda z : \blacksquare\mathsf{Str}.\mathsf{case}_{\mathsf{Str}}(\mathsf{rand}, 0(z), 1(z), \varepsilon)
$$

Notice that, even if we haven’t defined $=$ and $\neg$, they are easily implementable in RSLR.
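We now give a simple example of how the big-step semantics of an RSLR term is evaluated; we
consider the term $\mathsf{RBG}$ applied to the string 01.

$$
\begin{aligned}
[\![\mathsf{RBG}\ 01]\!] &= [\![\mathsf{RBG}]\!](\lambda y.\mathsf{rec}_{\mathsf{Str}}(y, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon)) \cdot [\![01]\!](01) \cdot [\![\mathsf{rec}_{\mathsf{Str}}(01, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon)]\!] \\
&= 1 \cdot 1 \cdot [\![\mathsf{rec}_{\mathsf{Str}}(01, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon)]\!] \\
&= [\![01]\!](01) \cdot [\![(f_{\mathsf{RBG}}\,01)(\mathsf{rec}_{\mathsf{Str}}(1, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon))]\!] = [\![(f_{\mathsf{RBG}}\,01)(\mathsf{rec}_{\mathsf{Str}}(1, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon))]\!]
\end{aligned}
$$

We can easily say that $[\![f_{\mathsf{RBG}}\,01]\!] = \{(\lambda z.\mathsf{case}_{\mathsf{Str}}(\mathsf{rand}, 0(z), 1(z), \varepsilon))^1\}$.

Furthermore we have:

$$
[\![\mathsf{rec}_{\mathsf{Str}}(1, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon)]\!] = [\![1]\!](1) \cdot [\![(f_{\mathsf{RBG}}\,1)(\mathsf{rec}_{\mathsf{Str}}(\varepsilon, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon))]\!]
$$

So, by the fact that $[\![\mathsf{rec}_{\mathsf{Str}}(\varepsilon, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon)]\!] = \{\varepsilon^1\}$ we have:

$$
[\![\mathsf{rec}_{\mathsf{Str}}(1, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon)]\!] = [\![\mathsf{case}_{\mathsf{Str}}(\mathsf{rand}, 0(\varepsilon), 1(\varepsilon), \varepsilon)]\!] = [\![\mathsf{rand}]\!](0) \cdot [\![0(\varepsilon)]\!] + [\![\mathsf{rand}]\!](1) \cdot [\![1(\varepsilon)]\!] = \{0^{1/2}, 1^{1/2}\}
$$

So, by substituting we have:

$$
\begin{aligned}
[\![\mathsf{RBG}\ 01]\!] &= [\![(f_{\mathsf{RBG}}\,01)(\mathsf{rec}_{\mathsf{Str}}(1, f_{\mathsf{RBG}}, f_{\mathsf{RBG}}, \varepsilon))]\!] \\
&= \tfrac{1}{2} \cdot [\![\mathsf{case}_{\mathsf{Str}}(\mathsf{rand}, 0(0), 1(0), \varepsilon)]\!] + \tfrac{1}{2} \cdot [\![\mathsf{case}_{\mathsf{Str}}(\mathsf{rand}, 0(1), 1(1), \varepsilon)]\!] \\
&= \tfrac{1}{2} \cdot \{00^{1/2}, 10^{1/2}\} + \tfrac{1}{2} \cdot \{01^{1/2}, 11^{1/2}\} \\
&= \{00^{1/4}, 01^{1/4}, 10^{1/4}, 11^{1/4}\}
\end{aligned}
$$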
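The computation above can be replayed mechanically. The following is a minimal sketch, not an implementation of RSLR: distributions are modeled as Python dictionaries from strings to exact `Fraction` probabilities, and the names `point`, `bind`, `f_rbg` and `rbg` are hypothetical helpers introduced only for this illustration. The recursion is evaluated call-by-value, so the recursive call is evaluated first and the step function is then applied to each of its possible results.

```python
from fractions import Fraction
from collections import defaultdict

# Uniform binary choice: the value distribution of `rand`.
RAND = {"0": Fraction(1, 2), "1": Fraction(1, 2)}


def point(v):
    """The distribution {v^1}."""
    return {v: Fraction(1)}


def bind(dist, k):
    """Sequence a distribution with a function returning distributions."""
    out = defaultdict(Fraction)  # Fraction() is 0
    for v, p in dist.items():
        for w, q in k(v).items():
            out[w] += p * q
    return dict(out)


def f_rbg(w):
    """lambda w. lambda z. case(rand, 0(z), 1(z), eps).

    The first argument w is ignored; a random bit is prepended to z.
    """
    return lambda z: bind(RAND, lambda bit: point(bit + z))


def rbg(y):
    """rec(y, f_rbg, f_rbg, eps): one random bit per bit of y."""
    if y == "":
        return point("")
    return bind(rbg(y[1:]), f_rbg(y))


dist_01 = rbg("01")
```

Under these assumptions `dist_01` assigns probability 1/4 to each of the strings 00, 01, 10 and 11, matching the distribution derived by hand.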


3 Equivalences 

Intuitively, we can say that two programs are equivalent if no one can distinguish them by observing 
their external, visible, behavior. A formalization of this intuition usually takes the form of context 
equivalence. A context is a term in which the hole $[\cdot]$ occurs at most once. Formally, contexts are
defined by the following grammar:

$$
\begin{aligned}
C ::=\ & t \mid [\cdot] \mid \lambda x.C \mid Ct \mid tC \mid 0(C) \mid 1(C) \mid \mathsf{tail}(C) \\
\mid\ & \mathsf{case}_A(C,t,t,t) \mid \mathsf{case}_A(t,C,C,C) \mid \mathsf{rec}_A(C,t,t,t)
\end{aligned}
$$

If the grammar above is extended as follows $C ::= \mathsf{rec}_A(t, C, t, t) \mid \mathsf{rec}_A(t, t, C, t) \mid \mathsf{rec}_A(t, t, t, C)$,
what we get is a nonlinear context. What the above definition already tells us is that our emphasis
in this paper will be on linear contexts, which are contexts whose holes lie outside the scope of
any recursion operator. Given a term $t$ we define $C[t]$ as the term obtained by substituting the
occurrence of $[\cdot]$ in $C$ (if any) with $t$. We only consider non-binding contexts here, i.e. contexts
are meant to be filled with closed terms. In other words, the type system from Section 2 can be
turned into one for contexts whose judgments take the form $\Gamma \vdash C[\vdash A] : B$, which means that
for every closed term $t$ of type $A$, it holds that $\Gamma \vdash C[t] : B$. See Figure 4 for details. Now
that the notion of a context has been properly defined, one can finally give the central notion of
equivalence in this paper.

Definition 1 (Context Equivalence) Given two terms $t, s$ such that $\vdash t, s : A$, we say that
$t$ and $s$ are context equivalent iff for every context $C$ such that $\vdash C[\vdash A] : \mathsf{Str}$ we have that
$[\![C[t]]\!](\varepsilon) = [\![C[s]]\!](\varepsilon)$.

The way we defined it means that context equivalence is a family of relations {=a}aeA indexed by 
types, which we denote as =. If in Definition [T| nonlinear contexts replace contexts, we get a finer 
relation, called nonlinear context equivalence , which we denote as =^. Both context equivalence 
and nonlinear context equivalence are easily proved to be congruences, i.e. compatible equivalence 
relations. 
3.1 Trace Equivalence

In this section we introduce a notion of trace equivalence for RSLR, and we show that it
characterizes context equivalence.

We define a trace as a sequence of actions $l_1 \cdot l_2 \cdot \ldots \cdot l_n$ such that $l_i \in \{\mathsf{pass}(v), \mathsf{view}(m) \mid v \in \mathbb{V}, m \in \mathbb{V}_{\mathsf{Str}}\}$.
Traces are indicated with metavariables like $T, S$. The compatibility of a trace $T$ with
a type $A$ is defined inductively on the structure of $A$. If $A = \mathsf{Str}$ then the only traces compatible
with $A$ are $T = \mathsf{view}(m)$, with $m \in \mathbb{V}_{\mathsf{Str}}$; otherwise, if $A = bB \to C$ then traces compatible with $A$
are in the form $T = \mathsf{pass}(v) \cdot S$ where $v \in \mathbb{V}_B$ and $S$ is itself compatible with $C$. With a slight abuse
of notation, we often assume traces to be compatible with the underlying type.
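As a small worked instance of the compatibility definition (an illustration added here, with arbitrary aspects $a$, $b$ and hypothetical string names $m_1$, $m_2$, $n$), unfolding the inductive clause twice and then the base clause gives the traces compatible with a two-argument first-order type:

```latex
% Traces compatible with  A = a Str -> b Str -> Str :
% two pass actions carrying string values, followed by one view action.
\[
  T \;=\; \mathsf{pass}(m_1) \cdot \mathsf{pass}(m_2) \cdot \mathsf{view}(n),
  \qquad m_1, m_2, n \in \mathbb{V}_{\mathsf{Str}} .
\]
```

In particular, every trace compatible with a type ends with exactly one view action, and the number of pass actions equals the number of arguments the type expects.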

Due to the probabilistic nature of our calculus, it is convenient to work with term distributions,
i.e., distributions whose support is the set of closed terms of a certain type $A$, instead of plain
terms. We denote term distributions with metavariables like $\mathcal{T}, \mathcal{S}, \ldots$ The effect traces have on
distributions can be formalized by giving some binary relations:

• First of all, we need a binary relation on term distributions, called $\leadsto$. Intuitively, $\mathcal{T} \leadsto \mathcal{S}$ iff $\mathcal{T}$
evolves to $\mathcal{S}$ by performing internal moves, only. Furthermore, we use $\to$ to indicate a single
internal move.

• We also need a binary relation $\Rightarrow^{T}$ between term distributions, which is however labeled by a
trace $T$, and which models internal and external reduction.

• Finally, we need a labeled relation $\mapsto^{T}$ between distributions and real numbers, which captures
the probability that distributions accept traces.

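The three relations are defined inductively by the rules in Figure 5.

$$
\frac{}{\mathcal{T} \Rightarrow^{\varepsilon} \mathcal{T}}
\qquad
\frac{\mathcal{T} \Rightarrow^{S} \mathcal{S} \quad \mathcal{S} \leadsto \mathcal{U}}{\mathcal{T} \Rightarrow^{S} \mathcal{U}}
\qquad
\frac{\mathcal{T} \Rightarrow^{S} \{(\lambda x.t_i)^{p_i}\}}{\mathcal{T} \Rightarrow^{S \cdot \mathsf{pass}(v)} \{(t_i\{v/x\})^{p_i}\}}
$$

$$
\frac{\mathcal{T} \Rightarrow^{S} \{(m_i)^{p_i}\}}{\mathcal{T} \mapsto^{S \cdot \mathsf{view}(m)} \sum_{m_i = m} p_i}
\qquad
\frac{t \to \{(t_i)^{p_i}\}}{\mathcal{T} + \{t^{p}\} \leadsto \mathcal{T} + \{(t_i)^{p \cdot p_i}\}}
$$

Figure 5: Term Distribution Small-Step Rules

The following gives basic, easy results about the relations we have introduced:
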
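Lemma 2 Let $\mathcal{T}$ be a term distribution for the type $A$. Then, there is a unique value distribution
$\mathcal{D}$ such that $\mathcal{T} \Rightarrow^{\varepsilon} \mathcal{D}$. As a consequence, for every trace $T$ compatible with $A$ there is a unique real
number $p$ such that $\mathcal{T} \mapsto^{T} p$. This real number is denoted as $\Pr(\mathcal{T}, T)$.

Proof. Suppose that $\mathcal{T}$ is normal, i.e. all elements in the support are values; then we have $\mathcal{T} = \mathcal{D}$
and then the thesis.

If $\mathcal{T}$ is not normal then there exists a set of indexes $J$ such that the terms $t_j$ with $j \in J$ aren’t values.

We know by Lemma 1 that for all $j \in J$ there exists a unique value distribution $\mathcal{D}_j$ such
that $t_j \Downarrow \mathcal{D}_j$ in a finite number of steps.

So, if we set $\mathcal{D} = \mathcal{T} \setminus \{(t_j)^{p_j}\}_{j \in J} + \sum_{j \in J} p_j \cdot \mathcal{D}_j$ we have $\mathcal{T} \Rightarrow^{\varepsilon} \mathcal{D}$ with $\mathcal{D}$ normal.

At this point we can say that for all $\mathcal{T}$ there exists $\mathcal{T}'$ normal such that $\mathcal{T} \Rightarrow^{\varepsilon} \mathcal{T}'$. So, given
$T = S \cdot \mathsf{pass}(v)$ we have by induction hypothesis that $\mathcal{T} \Rightarrow^{S} \{(\lambda x.t_i)^{p_i}\}$. Then, by performing the
action $\mathsf{pass}(v)$ we have $\mathcal{T} \Rightarrow^{S \cdot \mathsf{pass}(v)} \{(t_i\{v/x\})^{p_i}\}$, but, by applying the previous point there exists
$\mathcal{T}'$ normal such that $\{(t_i\{v/x\})^{p_i}\} \Rightarrow^{\varepsilon} \mathcal{T}'$ and then by the small-step rules we have $\mathcal{T} \Rightarrow^{S \cdot \mathsf{pass}(v)} \mathcal{T}'$
with $\mathcal{T}'$ a normal distribution.

Suppose now that $T = S \cdot \mathsf{view}(m)$; then we have by induction hypothesis $\mathcal{T} \Rightarrow^{S} \mathcal{T}' = \{(m_i)^{p_i}\}$
with $\mathcal{T}'$ unique; so, if we perform the action $\mathsf{view}(m)$ we have $\mathcal{T} \mapsto^{S \cdot \mathsf{view}(m)} p = \sum_{m_i = m} p_i$, which
is unique by construction. □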

We are now ready to define what we mean by trace equivalence.

Definition 2 Given two term distributions $\mathcal{T}, \mathcal{S}$ we say that they are trace equivalent (and we
write $\mathcal{T} \simeq^{\mathsf{T}} \mathcal{S}$) if, for all traces $T$ it holds that $\Pr(\mathcal{T}, T) = \Pr(\mathcal{S}, T)$. In particular, then, two terms
$t, s$ are trace equivalent when $\{t^1\} \simeq^{\mathsf{T}} \{s^1\}$ and we write $t \simeq^{\mathsf{T}} s$ in that case.
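The probability that a distribution accepts a trace can be sketched semantically, under strong simplifying assumptions that are not part of the original development: closed values of base type are Python strings, closed values of function type are Python callables mapping an argument value to a distribution over results, and distributions are dictionaries with exact `Fraction` weights. The names `pr`, `coin`, `flipped_coin` and `identity` are hypothetical. Checking equality of `pr` on a finite set of traces, as done below, can only refute trace equivalence or give evidence for it; it is not a proof, since the definition quantifies over all traces.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def pr(dist, trace):
    """Probability that `dist` accepts `trace`.

    A trace is a list of ("pass", value) actions ended by one ("view", string).
    """
    (kind, arg), rest = trace[0], trace[1:]
    if kind == "view":
        # Probability mass sitting on the string being observed.
        return dist.get(arg, Fraction(0))
    # kind == "pass": apply every function in the support to the same value
    # and mix the resulting distributions according to their weights.
    mixed = {}
    for f, p in dist.items():
        for w, q in f(arg).items():
            mixed[w] = mixed.get(w, Fraction(0)) + p * q
    return pr(mixed, rest)


def coin(x):
    """Ignore the argument and return a uniformly random bit."""
    return {"0": HALF, "1": HALF}


def flipped_coin(x):
    """Same as coin, but with the two outcomes produced in swapped order."""
    return {"1": HALF, "0": HALF}


def identity(x):
    """Return the argument unchanged."""
    return {x: Fraction(1)}


strings = ["", "0", "1", "01", "10"]
traces = [[("pass", v), ("view", m)] for v in strings for m in strings]
agree = all(pr({coin: Fraction(1)}, t) == pr({flipped_coin: Fraction(1)}, t)
            for t in traces)
separating = [("pass", "0"), ("view", "0")]
```

On these sample traces `coin` and `flipped_coin` are indistinguishable, while the trace `separating` tells `identity` apart from `coin`: the former accepts it with probability 1, the latter with probability 1/2.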

The following states some basic properties about the reduction relations we have just introduced. 
This will be useful in the following: 

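Lemma 3 (Trace Equivalence Properties) Suppose given two term distributions $\mathcal{T}, \mathcal{S}$ such
that $\mathcal{T} \simeq^{\mathsf{T}} \mathcal{S}$. Then:

• If $\mathcal{T} \leadsto \mathcal{T}'$ then $\mathcal{T}' \simeq^{\mathsf{T}} \mathcal{S}$.

• If $\mathcal{T} \Rightarrow^{\mathsf{pass}(v)} \mathcal{T}'$ and $\mathcal{S} \Rightarrow^{\mathsf{pass}(v)} \mathcal{S}'$, then $\mathcal{T}' \simeq^{\mathsf{T}} \mathcal{S}'$.

• If $\mathcal{T} \mapsto^{\mathsf{view}(m)} p$ then $\mathcal{S} \mapsto^{\mathsf{view}(m)} p$.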

Proof. The proof is a simple application of the definition of trace equivalence. □ 

It is easy to prove that trace equivalence is an equivalence relation. The next step, then, is 
to prove that trace equivalence is compatible, thus paving the way to a proof of soundness w.r.t. 
context equivalence. Unfortunately, the direct proof of compatibility (i.e., an induction on the 
structure of contexts) simply does not work: the way the operational semantics is specified makes 
it impossible to track how a term behaves in a context. Following [B], we proceed by considering a 
refined semantics, defined not on terms but on pairs whose first component is a context and whose 
second component is a term distribution. Formally, a context pair has the form $(C, \mathcal{T})$, where $C$ is
a context and $\mathcal{T}$ is a term distribution. A (context) pair distribution is a distribution over context
pairs. Such a pair distribution $\mathcal{P} = \{(C_i, \mathcal{T}_i)^{p_i}\}$ is said to be normal if for all $i$ and for all $t$ in the
support of $\mathcal{T}_i$ we have that $C_i[t]$ is a value. We show how a pair $(C, \mathcal{T})$ evolves following a trace $S$
by giving a one-step reduction relation (denoted with $\to$) and the small-step semantics described
by the rules in Figure 6 and Figure 7.

The following tells us that working with context pairs is the same as working with terms as 
far as traces are concerned: 

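Lemma 4 Suppose given a context $C$, a term distribution $\mathcal{T}$, and a trace $S$. Then if $(C, \mathcal{T}) \Rightarrow^{S} \{(C_i, \mathcal{T}_i)^{p_i}\}$
then $C[\mathcal{T}] \Rightarrow^{S} \{(C_i[\mathcal{T}_i])^{p_i}\}$. Moreover, if $(C, \mathcal{T}) \mapsto^{S} p$, then $\Pr(C[\mathcal{T}], S) = p$.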
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(MJ) _>passO) 

{([], T') 1 } 

(Ax.C, T) -h 

.pass(„) {(Cj^}^) 1 } 





7— 1—^view(m) 


(■m,T) M . view (’n) 

1 (mVn 

, > view(m) q 

([.], 7") ^.vlewlrri.) p 

(C, T) t-> view (2i) P 

(C, 77) t-> view l 

m.) p 

(C, T) p b 

(0(C), T) 

, ) view(Om) ^ 

(1(C), T) ^ vie " 

iMn) p 

(tail(C), T) 1—^view(m) 


t - 

-{(ti)*’*} 

T 

-> V 


(t,T) 

->{M)«} 

([•l.'D — 

■{([•], T') 1 } 



(C, T) -^ ass C) {(C',7"') 1 } 



(Cv, T) - 

-UC'.T') 1 } 



(C,T) —> {(Cj,7i) p *} 

i-{(i*) p< } 

(C, 7~) value 


(Ct,T) - {(C<t, 71)P‘} 

(Ct,T)- 

- {(cti,7ir*} 


(C, T) - 

-*■ {(C», 7i) p *} 

t - 

♦{(**)”*} 


(vC, T) - 

- {(^Ci,71)w} 

(tC,T)~ 

- {(tiC,7^)P‘} 





(C, 77 e V 


((A x.t)C,T) ^ mc/^T) 1 } 


(case A (t, Co, Cl, C c ),T) — {(case A (ii, C 0 , Ci, C E ), 71)« } 
_ {C,T)-+ _ 

(case A (C, to,ti,t e ),T) —» {(case A (Ci, t 0 , ti ,te),Tl) Pi } 


(case A (0m, C 0 , Cl, C e ),T) - 

-{(Co.T) 1 } 

(case A (lm, Co, Ci, C e ),T) - 

■> {(Ci, 7 - ) 1 } 


(case A (e,C 0 ,Ci,C E ),7") - {(C^T) 1 } 
m e V Str {(C, 71 ^ view (^ p^} 


(case A (C, to,ti,t e ), T) {(t 0 , -fan Pon, (ti, ■) S ^ P -, (C, 0 P -, } 

_ (C,T) -> {(Ci,T,)^} _ 

(rec A (C,t 0 ,t 1 ,t e ),T) -» {(rec A (C;, t 0 , ti , t E )) Pi , Ti ) Pi } 

(C, T) ^ view ^> 

. , . \N {((‘o22)rec A (ri,t o ,fi,t ( ,),r) f ’^}m=0n + 

(rec A (G,t 0 ,ti,t e p {(( tl m)rec A (n,t 0 ,t 1 ,t e ),T) !, 2!-} I: , = i n + {(i e ,r) i, i} 


Figure 6: One-step Rules 



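$$
\frac{}{\mathcal{P} \Rightarrow^{\varepsilon} \mathcal{P}}
\qquad
\frac{\mathcal{P} \Rightarrow^{S} \mathcal{P}' \quad \mathcal{P}' \leadsto \mathcal{P}''}{\mathcal{P} \Rightarrow^{S} \mathcal{P}''}
\qquad
\frac{\mathcal{P} \Rightarrow^{S} \{(C_i, \mathcal{T}_i)^{p_i}\} \quad (C_i, \mathcal{T}_i) \to^{\mathsf{pass}(v)} \{(C'_i, \mathcal{T}'_i)^{1}\}}{\mathcal{P} \Rightarrow^{S \cdot \mathsf{pass}(v)} \{(C'_i, \mathcal{T}'_i)^{p_i}\}}
$$

$$
\frac{\mathcal{P} \Rightarrow^{S} \{(C_i, \mathcal{T}_i)^{p_i}\} \quad (C_i, \mathcal{T}_i) \mapsto^{\mathsf{view}(m)} p'_i}{\mathcal{P} \mapsto^{S \cdot \mathsf{view}(m)} \sum_i p_i \cdot p'_i}
\qquad
\frac{(C, \mathcal{T}) \to \{(C_i, \mathcal{T}_i)^{p_i}\}}{\mathcal{P} + \{(C, \mathcal{T})^{p}\} \leadsto \mathcal{P} + \{(C_i, \mathcal{T}_i)^{p \cdot p_i}\}}
$$

Figure 7: Small-Step Rules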

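Proof. • The first case comes from the definition of the one-step and small-step semantics.

• If $S = S' \cdot \mathsf{view}(m)$ with $S'$ an incomplete trace, by the previous point we have that $C[\mathcal{T}] \Rightarrow^{S'} \{(C_i[\mathcal{T}_i])^{p_i}\}$
and $(C, \mathcal{T}) \Rightarrow^{S'} \{(C_i, \mathcal{T}_i)^{p_i}\}$. So we have

$$
\Pr(C[\mathcal{T}], S) = \sum_i p_i \cdot \Pr(C_i[\mathcal{T}_i], \mathsf{view}(m))
= \sum_i p_i \cdot
\begin{cases}
1 & \text{if } C_i = m; \\
0 & \text{if } C_i = m' \neq m; \\
\mathcal{T}_i(m) & \text{if } C_i = [\cdot],
\end{cases}
$$

which is exactly the value $p$ such that $(C, \mathcal{T}) \mapsto^{S} p$.

□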


But how could we exploit context pairs for our purposes? The key idea can be informally explained 
as follows: there is a notion of “relatedness” for pair distributions which not only is stricter than 
trace equivalence, but can be proved to be preserved along reduction, even when interaction with 
the environment is taken into account. 

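Definition 3 (Trace Relatedness) Let $\mathcal{P}, \mathcal{Q}$ be two pair distributions. We say that they are
trace-related, and we write $\mathcal{P} \nabla \mathcal{Q}$, if there exist families $\{C_i\}_{i \in I}$, $\{\mathcal{T}_i\}_{i \in I}$, $\{\mathcal{S}_i\}_{i \in I}$, and $\{p_i\}_{i \in I}$ such
that $\mathcal{P} = \{(C_i, \mathcal{T}_i)^{p_i}\}$, $\mathcal{Q} = \{(C_i, \mathcal{S}_i)^{p_i}\}$ and for every $i \in I$, it holds that $\mathcal{T}_i \simeq^{\mathsf{T}} \mathcal{S}_i$.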

The first observation about trace relatedness has to do with stability with respect to internal 
reduction: 

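Lemma 5 (Internal Stability) Let $\mathcal{P}, \mathcal{Q}$ be two pair distributions such that $\mathcal{P} \nabla \mathcal{Q}$; then, if there
exists $\mathcal{P}'$ such that $\mathcal{P} \Rightarrow^{\varepsilon} \mathcal{P}'$, then there exists $\mathcal{Q}'$ such that $\mathcal{Q} \Rightarrow^{\varepsilon} \mathcal{Q}'$ and $\mathcal{P}' \nabla \mathcal{Q}'$.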

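Proof. By definition of $\nabla$, for all $(C, \mathcal{T})^{p} \in \mathcal{P}$ there exists $(C, \mathcal{S})^{p} \in \mathcal{Q}$ such that $\mathcal{T} \simeq^{\mathsf{T}} \mathcal{S}$.

If $\mathcal{P} \Rightarrow^{\varepsilon} \mathcal{P}'$ then we have either $\mathcal{P}' = \mathcal{P}$ or $\mathcal{P} \leadsto \mathcal{P}'$; if $\mathcal{P}' = \mathcal{P}$ then we choose $\mathcal{Q}' = \mathcal{Q}$ and we get
the thesis.

If $\mathcal{P} \leadsto \mathcal{P}'$ then there exists a pair $(C, \mathcal{T}) \in S(\mathcal{P})$ that reduces; we face two possible
cases:

• The first case is a term distribution reduction, i.e. $(C, \mathcal{T}) \to \{(C, \mathcal{T}')^{1}\}$.

By the small-step rules we know that $\mathcal{P}' = \mathcal{P} \setminus \{(C, \mathcal{T})^{p}\} + \{(C, \mathcal{T}')^{p}\}$, but, given $(C, \mathcal{S})^{p} \in \mathcal{Q}$
with $\mathcal{T} \simeq^{\mathsf{T}} \mathcal{S}$, by Lemma 3 we know $\mathcal{T}' \simeq^{\mathsf{T}} \mathcal{S}$ and then if we set $\mathcal{Q}' = \mathcal{Q}$ we have the
thesis.

• The second case is a context reduction, i.e. $(C, \mathcal{T}) \to \{(C_i, \mathcal{T}_i)^{p_i}\}$.

We focus our attention on one particular reduction.

Suppose that the pair that reduces is $(\mathsf{case}_A(C, t_0, t_1, t_\varepsilon), \mathcal{T})^{p} \in \mathcal{P}$, with $(C, \mathcal{T})$ value; we
know that there exists $(\mathsf{case}_A(C, t_0, t_1, t_\varepsilon), \mathcal{S})^{p} \in \mathcal{Q}$ such that $\mathcal{T} \simeq^{\mathsf{T}} \mathcal{S}$.

If $C = m$ by the one-step rules we have:

$$
(\mathsf{case}_A(m, t_0, t_1, t_\varepsilon), \mathcal{T}) \to
\begin{cases}
\{(t_0, \mathcal{T})^{1}\} & \text{if } m = 0n; \\
\{(t_1, \mathcal{T})^{1}\} & \text{if } m = 1n; \\
\{(t_\varepsilon, \mathcal{T})^{1}\} & \text{if } m = \varepsilon,
\end{cases}
$$

and similarly:

$$
(\mathsf{case}_A(m, t_0, t_1, t_\varepsilon), \mathcal{S}) \to
\begin{cases}
\{(t_0, \mathcal{S})^{1}\} & \text{if } m = 0n; \\
\{(t_1, \mathcal{S})^{1}\} & \text{if } m = 1n; \\
\{(t_\varepsilon, \mathcal{S})^{1}\} & \text{if } m = \varepsilon.
\end{cases}
$$

So we set:

$$
\mathcal{P}' = \mathcal{P} \setminus \{(\mathsf{case}_A(m, t_0, t_1, t_\varepsilon), \mathcal{T})^{p}\} + \{(t', \mathcal{T})^{p}\}
\qquad
\mathcal{Q}' = \mathcal{Q} \setminus \{(\mathsf{case}_A(m, t_0, t_1, t_\varepsilon), \mathcal{S})^{p}\} + \{(t', \mathcal{S})^{p}\}
$$

where $t'$ is one among $t_0, t_1, t_\varepsilon$ depending on $m$, and we easily get the thesis.

If $C = [\cdot]$ then by the one-step rules we have:

$$
(\mathsf{case}_A([\cdot], t_0, t_1, t_\varepsilon), \mathcal{T}) \to \{(t_0, \mathcal{T})^{\mathcal{T}(M_0)}, (t_1, \mathcal{T})^{\mathcal{T}(M_1)}, (t_\varepsilon, \mathcal{T})^{\mathcal{T}(\varepsilon)}\}
$$

$$
(\mathsf{case}_A([\cdot], t_0, t_1, t_\varepsilon), \mathcal{S}) \to \{(t_0, \mathcal{S})^{\mathcal{S}(M_0)}, (t_1, \mathcal{S})^{\mathcal{S}(M_1)}, (t_\varepsilon, \mathcal{S})^{\mathcal{S}(\varepsilon)}\}
$$

with $M_0 = \{0m\}_{m \in \mathbb{V}_{\mathsf{Str}}}$, $M_1 = \{1m\}_{m \in \mathbb{V}_{\mathsf{Str}}}$.
But we know, for all $M$:

$$
\mathcal{T}(M) = \sum_{m \in M} \mathcal{T}(m) = \sum_{m \in M} \mathcal{S}(m) = \mathcal{S}(M)
$$

So we have

$$
\mathcal{P}' = \mathcal{P} \setminus \{(\mathsf{case}_A([\cdot], t_0, t_1, t_\varepsilon), \mathcal{T})^{p}\} + \{(t_0, \mathcal{T})^{p \cdot \mathcal{T}(M_0)}, (t_1, \mathcal{T})^{p \cdot \mathcal{T}(M_1)}, (t_\varepsilon, \mathcal{T})^{p \cdot \mathcal{T}(\varepsilon)}\}
$$

and if we set

$$
\mathcal{Q}' = \mathcal{Q} \setminus \{(\mathsf{case}_A([\cdot], t_0, t_1, t_\varepsilon), \mathcal{S})^{p}\} + \{(t_0, \mathcal{S})^{p \cdot \mathcal{S}(M_0)}, (t_1, \mathcal{S})^{p \cdot \mathcal{S}(M_1)}, (t_\varepsilon, \mathcal{S})^{p \cdot \mathcal{S}(\varepsilon)}\}
$$

we obtain the thesis.

The recursive case $\mathsf{rec}_A(C, t_0, t_1, t_\varepsilon)$, with $(C, \mathcal{T})$ value, is similar.

In the other cases, if $(C, \mathcal{T}) \to \{(C_i, \mathcal{T}_i)^{p_i}\}$, by definition of $\nabla$ we know that there must
exist $(C, \mathcal{S})^{p} \in \mathcal{Q}$ such that $(C, \mathcal{S}) \to \{(C_i, \mathcal{S}_i)^{p_i}\}$ (that is, a reduction to the same contexts
$C_i$ with the same probabilities $p_i$), so we have to prove that $\mathcal{T}_i \simeq^{\mathsf{T}} \mathcal{S}_i$ for all $i$. This is true
because either the two term distributions remain the same, i.e. $\mathcal{T}_i = \mathcal{T}$, $\mathcal{S}_i = \mathcal{S}$ for all $i$, or
the context passes the same value to the two term distributions and so by Lemma 3
$\mathcal{T}_i \simeq^{\mathsf{T}} \mathcal{S}_i$ for all $i$.

Now if we set $\mathcal{P}' = \mathcal{P} \setminus \{(C, \mathcal{T})^{p}\} + \{(C_i, \mathcal{T}_i)^{p \cdot p_i}\}$ and $\mathcal{Q}' = \mathcal{Q} \setminus \{(C, \mathcal{S})^{p}\} + \{(C_i, \mathcal{S}_i)^{p \cdot p_i}\}$ then
we have $\mathcal{P} \leadsto \mathcal{P}'$, $\mathcal{Q} \leadsto \mathcal{Q}'$ and $\mathcal{P}' \nabla \mathcal{Q}'$.

□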


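Once Internal Stability is proved, and since the relation $\leadsto$ can be proved to be strongly normalizing
also for context pair distributions, one gets that:

Lemma 6 (Bisimulation, Internally) If $\mathcal{P}, \mathcal{Q}$ are pair distributions, with $\mathcal{P} \nabla \mathcal{Q}$, then there are
normal distributions $\mathcal{P}', \mathcal{Q}'$ such that $\mathcal{P} \Rightarrow^{\varepsilon} \mathcal{P}'$, $\mathcal{Q} \Rightarrow^{\varepsilon} \mathcal{Q}'$ and $\mathcal{P}' \nabla \mathcal{Q}'$.

Proof. The proof comes from the fact that, given $\mathcal{P}$, if it is not normal, there is $\mathcal{P}'$ normal such
that $\mathcal{P} \leadsto^{*} \mathcal{P}'$, and by the previous lemma we have $\mathcal{P}' \nabla \mathcal{Q}$. Then if $\mathcal{Q}$ isn’t normal we can repeat
the procedure and get $\mathcal{Q}'$ such that $\mathcal{Q} \leadsto^{*} \mathcal{Q}'$ and $\mathcal{P}' \nabla \mathcal{Q}'$. □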

The next step consists in proving that context pair distributions which are trace related are 
not only bisimilar as for internal reduction, but also for external reduction: 

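Lemma 7 (Bisimulation, Externally) Given two pair distributions $\mathcal{P}, \mathcal{Q}$ with $\mathcal{P} \nabla \mathcal{Q}$, then for
all traces $S$ we have:

1. If $\mathcal{P} \Rightarrow^{S} \mathcal{M}$, with $\mathcal{M}$ a normal distribution, then $\mathcal{Q} \Rightarrow^{S} \mathcal{N}$, where $\mathcal{M} \nabla \mathcal{N}$ and $\mathcal{N}$ is a normal
distribution too.

2. If $\mathcal{P} \mapsto^{S} p$ then $\mathcal{Q} \mapsto^{S} p$.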


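Proof. We proceed by induction on the length of $S$.

If $S = \varepsilon$ then by Lemma 6 we get the thesis. Suppose now $S = S' \cdot \mathsf{pass}(v)$; then we have by induction
hypothesis: $\mathcal{P} \Rightarrow^{S'} \{(C_i, \mathcal{T}_i)^{p_i}\}_{i \in I}$ and $\mathcal{Q} \Rightarrow^{S'} \{(C_i, \mathcal{S}_i)^{p_i}\}_{i \in I}$ with $\mathcal{T}_i \simeq^{\mathsf{T}} \mathcal{S}_i$ for all $i \in I$ and the two
pair distributions normal.

But, by the one-step rules we have only two possible derivations for an action $\mathsf{pass}(v)$:

$$
\frac{}{(\lambda x.C, \mathcal{T}) \to^{\mathsf{pass}(v)} \{(C\{v/x\}, \mathcal{T})^{1}\}}
\qquad
\frac{\mathcal{T} \Rightarrow^{\mathsf{pass}(v)} \mathcal{T}'}{([\cdot], \mathcal{T}) \to^{\mathsf{pass}(v)} \{([\cdot], \mathcal{T}')^{1}\}}
$$

So if we set $J = \{j \in I \mid C_j = \lambda x.C'_j\}$, $K = \{k \in I \mid C_k = [\cdot]\}$ we have:

$$
\mathcal{P} \Rightarrow^{S'} \{(\lambda x.C'_j, \mathcal{T}_j)^{p_j}\} + \{([\cdot], \mathcal{T}_k)^{p_k}\}
\qquad
\mathcal{Q} \Rightarrow^{S'} \{(\lambda x.C'_j, \mathcal{S}_j)^{p_j}\} + \{([\cdot], \mathcal{S}_k)^{p_k}\}
$$

At this point, if $\mathcal{T}_k \Rightarrow^{\mathsf{pass}(v)} \mathcal{T}'_k$ and $\mathcal{S}_k \Rightarrow^{\mathsf{pass}(v)} \mathcal{S}'_k$ we know $\mathcal{T}'_k \simeq^{\mathsf{T}} \mathcal{S}'_k$ for all $k$, so by using the
one-step rules we set:

$$
\mathcal{P}'' = \{(C'_j\{v/x\}, \mathcal{T}_j)^{p_j}\} + \{([\cdot], \mathcal{T}'_k)^{p_k}\}
\qquad
\mathcal{Q}'' = \{(C'_j\{v/x\}, \mathcal{S}_j)^{p_j}\} + \{([\cdot], \mathcal{S}'_k)^{p_k}\}
$$

and we have $\mathcal{P} \Rightarrow^{S' \cdot \mathsf{pass}(v)} \mathcal{P}''$, $\mathcal{Q} \Rightarrow^{S' \cdot \mathsf{pass}(v)} \mathcal{Q}''$ with $\mathcal{P}'' \nabla \mathcal{Q}''$, and so by applying Lemma 6 we get
the thesis (1).

Suppose now $S = S' \cdot \mathsf{view}(m)$.

By induction we know that $\mathcal{P} \Rightarrow^{S'} \{(C_i, \mathcal{T}_i)^{p_i}\}$, $\mathcal{Q} \Rightarrow^{S'} \{(C_i, \mathcal{S}_i)^{p_i}\}$ with $\mathcal{T}_i \simeq^{\mathsf{T}} \mathcal{S}_i$ for all $i \in I$ and
that the two pair distributions are normal.

So we set $J = \{j \in I \mid C_j = m_j\}$, $K = \{k \in I \mid C_k = [\cdot]\}$ and know:

$$
\mathcal{P} \Rightarrow^{S'} \{(m_j, \mathcal{T}_j)^{p_j}\} + \{([\cdot], \mathcal{T}_k)^{p_k}\}
\qquad
\mathcal{Q} \Rightarrow^{S'} \{(m_j, \mathcal{S}_j)^{p_j}\} + \{([\cdot], \mathcal{S}_k)^{p_k}\}
$$

So we have:

$$
\mathcal{P} \mapsto^{S' \cdot \mathsf{view}(m)} \sum_{m_j = m} p_j + \sum_{k} p_k \cdot \Pr(\mathcal{T}_k, \mathsf{view}(m))
\qquad
\mathcal{Q} \mapsto^{S' \cdot \mathsf{view}(m)} \sum_{m_j = m} p_j + \sum_{k} p_k \cdot \Pr(\mathcal{S}_k, \mathsf{view}(m))
$$

But $\Pr(\mathcal{T}_k, \mathsf{view}(m)) = \Pr(\mathcal{S}_k, \mathsf{view}(m))$ and so we get the thesis (2). □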

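Lemma 8 Given two term distributions $\mathcal{T}, \mathcal{S}$ such that $\mathcal{T} \simeq^{\mathsf{T}} \mathcal{S}$, then for all contexts $C$ and for all
traces $S$ we have: $\Pr(C[\mathcal{T}], S) = \Pr(C[\mathcal{S}], S)$.

Proof. If the trace $S$ doesn’t end with the action $\mathsf{view}(\cdot)$ then $\Pr(C[\mathcal{T}], S) = 1 = \Pr(C[\mathcal{S}], S)$.
Otherwise we know that $(C, \mathcal{T}) \mapsto^{S} p$, we can write $\Pr((C, \mathcal{T}), S) = p$, and by Lemma 7 we know
$(C, \mathcal{S}) \mapsto^{S} p$. But by Lemma 4 we know $\Pr(C[\mathcal{T}], S) = \Pr((C, \mathcal{T}), S) = \Pr((C, \mathcal{S}), S) = \Pr(C[\mathcal{S}], S)$
and then the thesis. □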

We are now in a position to prove the main result of this section: 

Theorem 2 Trace equivalence is a congruence. 

Proof. We have to prove that, given two terms $t, s$ such that $t \simeq^{\mathsf{T}} s$, then for all contexts $C$ we
have that $C[t] \simeq^{\mathsf{T}} C[s]$, i.e., for all traces $S$ we have $\Pr(C[t], S) = \Pr(C[s], S)$. But by Lemma 4
and Lemma 7 we have, indeed, that $\Pr(C[t], S) = \Pr((C, \{t^1\}), S) = \Pr((C, \{s^1\}), S) = \Pr(C[s], S)$,
because the two pair distributions $\{(C, \{t^1\})^1\}$ and $\{(C, \{s^1\})^1\}$ are trace-related. □

Corollary 1 (Soundness) Trace equivalence is included into context equivalence. 

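Proof. If $t \simeq^{\mathsf{T}} s$, then by the previous theorem we have that for all contexts $C$ we have
$C[t] \simeq^{\mathsf{T}} C[s]$ and this means that if we choose a trace $T = \mathsf{view}(\varepsilon)$ then we have $[\![C[t]]\!](\varepsilon) =
\Pr(C[t], \mathsf{view}(\varepsilon)) = \Pr(C[s], \mathsf{view}(\varepsilon)) = [\![C[s]]\!](\varepsilon)$, and so the thesis. □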

Theorem 3 (Full Abstraction) Context equivalence coincides with trace equivalence.

Proof. For any admissible trace $T$ for $A$, there is a context $C_T[\cdot]$ such that $\Pr(t, T) = [\![C_T[t]]\!](\varepsilon)$,
which can be proved by induction on the structure of $A$. □




3.2 Some Words on Applicative Bisimulation 

As we already discussed, the quantification over all contexts makes the task of proving two terms 
to be context equivalent burdensome, even if we restrict to linear contexts. And we cannot say 
that trace equivalence really overcomes this problem: there is a universal quantification anyway, 
even if contexts are replaced by objects (i.e. traces) having a simpler structure. It is thus natural 
to look for other techniques. The interactive view provided by traces suggests the possibility to go 
for coinductive techniques akin to Abramsky’s applicative bisimulation, which has already been 
shown to be adaptable to probabilistic λ-calculi El 12-

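First of all, we introduce a Labeled Transition System, by defining a Labeled Markov chain
$\mathcal{M} = (\mathcal{S}, L, \mathcal{P})$ where $\mathcal{S} = \mathbb{T} \uplus \mathbb{V}$ is the set of states, $L = \{\mathsf{eval}, \mathsf{pass}(\cdot), \mathsf{view}(\cdot)\}$ is the set of labels,
$\mathcal{A}$ is the set of types and $\mathcal{P}$ is the probability measure defined as follows:

$$
\begin{aligned}
& \mathcal{P} : (\mathcal{S}, \mathcal{A}) \times L \times (\mathcal{S}, \mathcal{A}) \to [0,1] \\
& \mathcal{P}((t, A), \mathsf{eval}, (v, A)) = [\![t]\!](v) \qquad \mathcal{P}((\lambda x.t, aA \to B), \mathsf{pass}(v), (t\{v/x\}, B)) = 1 \\
& \mathcal{P}((m, \mathsf{Str}), \mathsf{view}(m), (m, \mathsf{Str})) = 1 \qquad \mathcal{P}((m, \mathsf{Str}), \mathsf{view}(m'), (m, \mathsf{Str})) = 0
\end{aligned}
$$

So, before giving the definition of bisimulation we define a typed relation as a family $\mathcal{R} = (\mathcal{R}^{\Gamma}_{A})_{A,\Gamma}$,
where each $\mathcal{R}^{\Gamma}_{A}$ is a binary relation on $\mathbb{T}^{\Gamma}_{A}$; we define the open extension $\mathcal{R}_{\circ}$ by saying that, given
two terms $t, s$ we have $t \mathcal{R}_{\circ} s$ iff for all $\Gamma$-closures $\xi$ we have that $\vdash (t\xi) \mathcal{R} (s\xi) : A$.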

Definition 4 Given a Labeled Markov Chain $M = (S, L, \mathcal{P})$, a probabilistic applicative bisimulation
is an equivalence relation $\mathcal{R}$ between the states of the Markov chain such that, given two
states $t, s$ we have $(t\,\mathcal{R}\,s) : A$ if and only if for each label $l$ and each equivalence class $E$ modulo $\mathcal{R}$ we have:

\[ \mathcal{P}((t,A), l, E) = \mathcal{P}((s,A), l, E) \]

We define $\sim$ as the reflexive and transitive closure of $\bigcup\{\mathcal{R} \mid \mathcal{R} \text{ bisimulation}\}$. We say that two
terms $t, s \in T^A_\Gamma$ are bisimilar (we write $\Gamma \vdash t \sim s : A$) if there exists a bisimulation between them
and we define $\sim_\circ$ as the bisimulation equivalence.

Definition 5 A probabilistic applicative bisimulation is defined to be any type-indexed family of
relations $\{\mathcal{R}_A\}_{A \in \mathbb{A}}$ such that for each $A$, $\mathcal{R}_A$ is an equivalence relation over the set of closed terms
of type $A$, and moreover the following holds:

• If $t\,\mathcal{R}_A\,s$, then for every equivalence class $E$ modulo $\mathcal{R}_A$, it holds that $[\![t]\!](E) = [\![s]\!](E)$.

• If $(\lambda x.t)\,\mathcal{R}_{aA \to B}\,(\lambda x.s)$, then for every closed value $v$ of type $A$, it holds that $(t\{v/x\})\,\mathcal{R}_B\,(s\{v/x\})$.

• If $m\,\mathcal{R}_{Str}\,n$, then $m = n$.

With some effort, one can prove that a greatest applicative bisimulation exists, and that it consists
of the union (at any type) of all bisimulation relations. This is denoted as $\sim$ and said to be
(applicative) bisimilarity. One can then generalize $\sim$ to a relation $\sim_\circ$ on open terms by the usual
open extension.

One way to show that bisimilarity is included in context equivalence consists in proving that
$\sim_\circ$ is a congruence; to reach this goal we first lift $\sim_\circ$ to another relation $\sim_\circ^H$ by the so-called
Howe's method [12], and then take its transitive closure, obtaining another relation $(\sim_\circ^H)^+$. This can be
done by the rules in Figure 8. By construction, the relation $(\sim_\circ^H)^+$ is a congruence. But one can
also show that it coincides with $\sim_\circ$, namely that $(\sim_\circ^H)^+ \supseteq {\sim_\circ}$ and ${\sim_\circ} \supseteq (\sim_\circ^H)^+$. The first inclusion
is again an easy consequence of the way $(\sim_\circ^H)^+$ is defined, and of the fact that $\sim$ is an equivalence
relation. The second one is more difficult, and needs some intermediate steps to be proved. The
first step is given by the following lemma.

Lemma 9 (Key Lemma) Given two terms $t, s$, we have:

• If $\vdash t \sim_\circ^H s : aA \to B$, then for all $E \in T^{B}_{x:aA}/{\sim_\circ}$, i.e. for every equivalence class $E$ modulo $\sim_\circ$, it holds that

\[ [\![t]\!](\lambda x.E) = [\![s]\!](\lambda x.E). \]

• If $\vdash t \sim_\circ^H s : Str$, then for all $m \in V^{Str}$ we have $[\![t]\!](m) = [\![s]\!](m)$.

Figure 8: Howe's Lifting and Transitive Closure Rules

Proof. We work by induction on the derivation of $\vdash t \sim_\circ^H s$.

• Suppose $t = m$, then we have $\vdash m \sim_\circ^H s : Str$ that is derived from $H_0$:

\[ \frac{\vdash m \sim_\circ s}{\vdash m \sim_\circ^H s} \]

So we have, for all $m' \in V^{Str}$, by definition of $\sim_\circ$:

\[ [\![m]\!](m') = [\![s]\!](m') \]

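• Suppose $t = \lambda x.t'$, then we have $\vdash \lambda x.t' \sim_\circ^H s : aA \to B$, derived from $H_2$:

\[ \frac{x \vdash t' \sim_\circ^H s' \qquad \vdash \lambda x.s' \sim_\circ s}{\vdash \lambda x.t' \sim_\circ^H s} \]

So, given $E \in T^{B}_{x:aA}/{\sim_\circ}$ we have:

\[ [\![\lambda x.t']\!](\lambda x.E) = \begin{cases} 1 & \text{if } t' \in E; \\ 0 & \text{otherwise} \end{cases} \;=\; [\![\lambda x.s']\!](\lambda x.E) = [\![s]\!](\lambda x.E) \]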


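• Suppose now $t = case_{aA \to B}(t', t'_0, t'_1, t'_\varepsilon)$, then $\vdash case_{aA \to B}(t', t'_0, t'_1, t'_\varepsilon) \sim_\circ^H s : aA \to B$, which
is derived from $H_4$:

\[ \frac{t' \sim_\circ^H s' \qquad t'_0 \sim_\circ^H s'_0 \qquad t'_1 \sim_\circ^H s'_1 \qquad t'_\varepsilon \sim_\circ^H s'_\varepsilon \qquad case_{aA \to B}(s', s'_0, s'_1, s'_\varepsilon) \sim_\circ s}{case_{aA \to B}(t', t'_0, t'_1, t'_\varepsilon) \sim_\circ^H s} \]

Then for all $E \in T^{B}_{x:aA}/{\sim_\circ}$ we have by induction hypothesis:

\begin{align*}
[\![t]\!](\lambda x.E) &= [\![case_{aA \to B}(t', t'_0, t'_1, t'_\varepsilon)]\!](\lambda x.E) \\
&= [\![t']\!](\varepsilon)\,[\![t'_\varepsilon]\!](\lambda x.E) + \sum_{m} [\![t']\!](0m)\,[\![t'_0]\!](\lambda x.E) + \sum_{m} [\![t']\!](1m)\,[\![t'_1]\!](\lambda x.E) \\
&= [\![s']\!](\varepsilon)\,[\![s'_\varepsilon]\!](\lambda x.E) + \sum_{m} [\![s']\!](0m)\,[\![s'_0]\!](\lambda x.E) + \sum_{m} [\![s']\!](1m)\,[\![s'_1]\!](\lambda x.E) \\
&= [\![s]\!](\lambda x.E)
\end{align*}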

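If $t = case_{Str}(t', t'_0, t'_1, t'_\varepsilon) : Str$ the proof is similar to the previous case.

• Suppose now $t = t_1t_2 : aA \to B$ and so we have $\vdash t_1t_2 \sim_\circ^H s : aA \to B$ that is derived from:

\[ \frac{t_1 \sim_\circ^H s_1 \qquad t_2 \sim_\circ^H s_2 \qquad s_1s_2 \sim_\circ s}{t_1t_2 \sim_\circ^H s} \]

We have to face two different cases: $t_2 \in T^{Str}$ and $t_2 \in T^{cC \to D}$. In both of them, $E_r$ ranges over the
equivalence classes modulo $\sim_\circ$ of the bodies $r$ of the abstractions $\lambda y.r$ to which $t_1$ may evaluate.
If $t_2 \in T^{Str}$ then for all $E \in T^{B}_{x:aA}/{\sim_\circ}$ we have:

\begin{align*}
[\![t]\!](\lambda x.E) = [\![t_1t_2]\!](\lambda x.E) &= \sum_{m' \in V^{Str}} [\![t_2]\!](m') \sum_{E_r} \sum_{r \in E_r} [\![t_1]\!](\lambda y.r)\,[\![r\{m'/y\}]\!](\lambda x.E) \\
&= \sum_{m' \in V^{Str}} [\![t_2]\!](m') \sum_{E_r} [\![t_1]\!](\lambda y.E_r)\,[\![E_r\{m'/y\}]\!](\lambda x.E) \\
&= \sum_{m' \in V^{Str}} [\![s_2]\!](m') \sum_{E_r} [\![s_1]\!](\lambda y.E_r)\,[\![E_r\{m'/y\}]\!](\lambda x.E) \\
&= [\![s_1s_2]\!](\lambda x.E) = [\![s]\!](\lambda x.E)
\end{align*}

If $t_2 \in T^{cC \to D}$ then, with $E_v$ ranging over the equivalence classes modulo $\sim_\circ$ of the bodies $v$ of the
abstractions $\lambda z.v$ to which $t_2$ may evaluate, we have:

\begin{align*}
[\![t]\!](\lambda x.E) = [\![t_1t_2]\!](\lambda x.E) &= \sum_{E_v} \sum_{v \in E_v} [\![t_2]\!](\lambda z.v) \Big( \sum_{E_r} \sum_{r \in E_r} [\![t_1]\!](\lambda y.r)\,[\![r\{\lambda z.v/y\}]\!](\lambda x.E) \Big) \\
&= \sum_{E_v} [\![t_2]\!](\lambda z.E_v) \sum_{E_r} [\![t_1]\!](\lambda y.E_r)\,[\![E_r\{\lambda z.E_v/y\}]\!](\lambda x.E) \\
&= \sum_{E_v} [\![s_2]\!](\lambda z.E_v) \sum_{E_r} [\![s_1]\!](\lambda y.E_r)\,[\![E_r\{\lambda z.E_v/y\}]\!](\lambda x.E) \\
&= [\![s_1s_2]\!](\lambda x.E) = [\![s]\!](\lambda x.E)
\end{align*}

• The case $t = t_1t_2 : Str$ is similar to the previous one.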


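• Finally, if $t = rec_{aA \to B}(t', t'_0, t'_1, t'_\varepsilon)$ then we have $\vdash rec_{aA \to B}(t', t'_0, t'_1, t'_\varepsilon) \sim_\circ^H s$ which is derived
from:

\[ \frac{t' \sim_\circ^H s' \qquad t'_0 \sim_\circ^H s'_0 \qquad t'_1 \sim_\circ^H s'_1 \qquad t'_\varepsilon \sim_\circ^H s'_\varepsilon \qquad rec_{aA \to B}(s', s'_0, s'_1, s'_\varepsilon) \sim_\circ s}{rec_{aA \to B}(t', t'_0, t'_1, t'_\varepsilon) \sim_\circ^H s} \]

then for all $E \in T^{B}_{x:aA}/{\sim_\circ}$ we have:

\begin{align*}
[\![t]\!](\lambda x.E) &= [\![rec_{aA \to B}(t', t'_0, t'_1, t'_\varepsilon)]\!](\lambda x.E) \\
&= [\![t']\!](\varepsilon)\,[\![t'_\varepsilon]\!](\lambda x.E) + \sum_{m \in V^{Str}} [\![t']\!](0m)\,[\![(t'_0\,(0m))(rec_{aA \to B}(m, t'_0, t'_1, t'_\varepsilon))]\!](\lambda x.E) \\
&\quad + \sum_{m \in V^{Str}} [\![t']\!](1m)\,[\![(t'_1\,(1m))(rec_{aA \to B}(m, t'_0, t'_1, t'_\varepsilon))]\!](\lambda x.E) \\
&= [\![s']\!](\varepsilon)\,[\![s'_\varepsilon]\!](\lambda x.E) + \sum_{m \in V^{Str}} [\![s']\!](0m)\,[\![(s'_0\,(0m))(rec_{aA \to B}(m, s'_0, s'_1, s'_\varepsilon))]\!](\lambda x.E) \\
&\quad + \sum_{m \in V^{Str}} [\![s']\!](1m)\,[\![(s'_1\,(1m))(rec_{aA \to B}(m, s'_0, s'_1, s'_\varepsilon))]\!](\lambda x.E) = [\![s]\!](\lambda x.E)
\end{align*}

• The case $t = rec_{Str}(t', t'_0, t'_1, t'_\varepsilon)$ is similar to the previous one.

So we have the thesis.

□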

Theorem 4 $(\sim_\circ^H)^+$ is a bisimulation.

Proof. We work by induction on the derivation of $(\sim_\circ^H)^+$, proving that $(\sim_\circ^H)^+$ satisfies the three
points of the definition above. This, in particular, relies on the Key Lemma. □

Theorem 5 Bisimilarity is a congruence.

Proof. The proof comes easily from the fact that $(\sim_\circ^H)^+$ is a congruence. Indeed it is transitive
and symmetric by definition and also compatible. By the definition of $(\sim_\circ^H)^+$ we have that
${\sim_\circ} \subseteq (\sim_\circ^H)^+$; but the theorem above tells us that $(\sim_\circ^H)^+$ is a bisimulation, and that it must
be included in $\sim_\circ$, the symmetric and transitive closure of all the bisimulations. So we have
${\sim_\circ} \subseteq (\sim_\circ^H)^+$ and $(\sim_\circ^H)^+ \subseteq {\sim_\circ}$, which means that ${\sim_\circ} = (\sim_\circ^H)^+$, and we get the thesis, namely that
$\sim_\circ$ is a congruence. □

As usual, being a congruence has soundness as an easy corollary: 

Corollary 2 (Soundness) Bisimilarity is included in context equivalence. 

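Is there any hope to get full abstraction? The answer is negative: applicative bisimilarity is too
strong to match context equivalence. A counterexample to that can be built easily following the
analogous one from [3]. Consider the following two terms:

\[ t = \lambda x.\ \mathsf{if}\ rand\ \mathsf{then}\ true\ \mathsf{else}\ false; \qquad s = \mathsf{if}\ rand\ \mathsf{then}\ (\lambda x.\ true)\ \mathsf{else}\ (\lambda x.\ false); \]

where we have used some easy syntactic sugar. It is easy to show that $t$ and $s$ are trace equivalent,
thus context equivalent. On the other hand, $t$ and $s$ cannot be bisimilar: $t$ is a single value which
makes its random choice only after receiving its argument, while $s$ evaluates to one of two
deterministic values, none of which is bisimilar to $t$.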
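
The gap between the two notions can be checked on a toy model. The following Python sketch is only an illustration under simplifying assumptions that are not part of RSLR: `rand` is taken to be a fair coin, a value is represented by the distribution over booleans it produces once an argument is passed, and the names `trace_prob` and `eval_classes` are hypothetical helpers.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# A value is modelled by the distribution it yields after pass(v) (assumption).
v_rand = {True: HALF, False: HALF}   # lambda x. if rand then true else false
v_true = {True: Fraction(1)}         # lambda x. true
v_false = {False: Fraction(1)}       # lambda x. false

# A term is modelled by the distribution over values reached by eval.
t = [(Fraction(1), v_rand)]
s = [(HALF, v_true), (HALF, v_false)]

def trace_prob(term, observed):
    """Probability of the trace pass(v) . view(observed)."""
    return sum(p * value.get(observed, Fraction(0)) for p, value in term)

def eval_classes(term):
    """Mass assigned by eval to each class of values with the same behaviour."""
    classes = {}
    for p, value in term:
        key = frozenset(value.items())
        classes[key] = classes.get(key, Fraction(0)) + p
    return classes

print(trace_prob(t, True), trace_prob(s, True))
print(eval_classes(t) == eval_classes(s))
```

Both terms accept every trace with the same probability, but the `eval` step already assigns different masses to the behaviour classes of values, which is what a bisimulation would have to match.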

This, however, is not the end of the story on coinductive methodologies for context equivalence
in RSLR. A different route, suggested by trace equivalence, consists in taking the naturally
definable (deterministic) labeled transition system of term distributions and ordinary bisimilarity 
over it. What one obtains this way is a precise characterization of context equivalence. There is 
a price to pay however, since one is forced to reason on distributions rather than terms. 


4 From Equivalences to Metrics 

The notion of observation on top of which context equivalence is defined is the probability of
evaluating to the empty string, and is thus quantitative in nature. This suggests the possibility of
generalizing context equivalence into a notion of distance between terms:

Definition 6 (Context Distance) For every type $A$, we define $\delta^C_A : T^A_\emptyset \times T^A_\emptyset \to \mathbb{R}_{[0,1]}$ as
$\delta^C_A(t,s) = \sup_{\vdash C[\vdash A] : Str} \big| [\![C[t]]\!](\varepsilon) - [\![C[s]]\!](\varepsilon) \big|$.

For every type $A$, the function $\delta^C_A$ is a pseudometric¹ on the space of closed terms. Obviously,
$\delta^C_A(t, s) = 0$ iff $t$ and $s$ are context equivalent. As such, then, the context distance can be seen as
a natural generalization of context equivalence, where a real number between 0 and 1 is assigned
to each pair of terms and is meant to be a measure of how different the two terms are in terms of
their behavior. $\delta^C$ refers to the family $\{\delta^C_A\}_{A \in \mathbb{A}}$.

One may wonder whether $\delta^C$, as we have defined it, can somehow be characterized by a trace-based
notion of metric, similarly to what has been done in Section 3 for equivalences. First of all,
let us define such a distance. Actually, the very notion of a trace needs to be slightly modified:
in the action $view(\cdot)$, instead of observing a single string $m$, we need to be able to observe the
action on a finite string set $M$. The probability of accepting a trace in a term will be modified
accordingly: $\Pr(t, view(M)) = [\![t]\!](M)$.

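Definition 7 (Trace Distance) For every type $A$, we define $\delta^T_A : T^A_\emptyset \times T^A_\emptyset \to \mathbb{R}_{[0,1]}$ as
$\delta^T_A(t, s) = \sup_{T} |\Pr(t,T) - \Pr(s,T)|$.

It is easy to realize that if $t \simeq^T s$ then $\delta^T_A(t,s) = 0$. Moreover, $\delta^T_A$ is itself a pseudometric. As
usual, $\delta^T$ denotes the family $\{\delta^T_A\}_{A \in \mathbb{A}}$.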
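
At type Str the only available traces are of the form view(M), so the trace distance between two closed terms reduces to the largest gap $|[\![t]\!](M) - [\![s]\!](M)|$ over finite sets $M$. The following Python sketch computes it by brute force; it assumes the two semantics are given as finitely supported dictionaries from strings to exact probabilities, and `trace_distance_str` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import chain, combinations

def mass(dist, strings):
    """Probability that the distribution assigns to a finite set of strings."""
    return sum((dist.get(m, Fraction(0)) for m in strings), Fraction(0))

def trace_distance_str(dist_t, dist_s):
    """sup over view(M) of |Pr(t, view(M)) - Pr(s, view(M))|.

    Strings outside both supports contribute nothing, so it is enough to
    enumerate the subsets M of the union of the two supports.
    """
    support = sorted(set(dist_t) | set(dist_s))
    subsets = chain.from_iterable(
        combinations(support, k) for k in range(len(support) + 1))
    return max(abs(mass(dist_t, M) - mass(dist_s, M)) for M in subsets)

T = {"0": Fraction(1, 2), "1": Fraction(1, 2)}
S = {"0": Fraction(3, 4), "1": Fraction(1, 4)}
print(trace_distance_str(T, S))
```

The enumeration is exponential in the size of the support and is meant only to make the definition concrete on small examples.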

But how should we proceed if we want to prove the two just introduced notions of distance
to coincide? Could we proceed more or less like in Section 3.1? The answer is positive, but
of course something must be found which plays the role of compatibility, since the latter is a
property of equivalences and not of metrics. The way out is relatively simple: what corresponds to
compatibility in metrics is non-expansiveness (see, e.g., El). A notion of distance $\delta$ is said to be
non-expansive iff for every pair of terms $t, s$ and for every context $C$, it holds that $\delta(C[t], C[s]) \le \delta(t, s)$,
that is a pseudometric too.
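
Non-expansiveness can be observed numerically in a much simplified setting. The sketch below assumes that both terms denote proper (total mass 1) distributions over strings, and models a context of type Str with a hole of type Str as a probabilistic map from strings to string distributions; this modelling, and the names `tv_distance` and `apply_context`, are assumptions made for illustration and not the paper's construction.

```python
from fractions import Fraction

def tv_distance(dist_a, dist_b):
    """For proper distributions, sup_M |a(M) - b(M)| equals half the L1 gap."""
    support = set(dist_a) | set(dist_b)
    gap = sum(abs(dist_a.get(m, Fraction(0)) - dist_b.get(m, Fraction(0)))
              for m in support)
    return gap / 2

def apply_context(kernel, dist):
    """Push a string distribution through a context modelled as a kernel."""
    out = {}
    for m, p in dist.items():
        for m2, q in kernel[m].items():
            out[m2] = out.get(m2, Fraction(0)) + p * q
    return out

T = {"0": Fraction(1, 2), "1": Fraction(1, 2)}
S = {"0": Fraction(3, 4), "1": Fraction(1, 4)}
# Hypothetical context: returns "a" on input "0", flips a fair coin on "1".
C = {"0": {"a": Fraction(1)},
     "1": {"a": Fraction(1, 2), "b": Fraction(1, 2)}}

print(tv_distance(T, S), tv_distance(apply_context(C, T), apply_context(C, S)))
```

On this instance the distance shrinks from 1/4 to 1/8 after the context is applied, in agreement with the inequality $\delta(C[t], C[s]) \le \delta(t, s)$.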

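Now we show some properties of the trace distance $\delta^T$ applied on term distributions.

Lemma 10 (Trace Distance Properties) Given two term distributions $\mathcal{T}, \mathcal{S}$ such that $\delta^T(\mathcal{T}, \mathcal{S}) = d$
then we have:

1. If $\mathcal{T} \to \mathcal{T}'$ then $\delta^T(\mathcal{T}', \mathcal{S}) = d$.

2. If $\mathcal{T} \Rightarrow^{pass(v)} \mathcal{T}'$ and $\mathcal{S} \Rightarrow^{pass(v)} \mathcal{S}'$ then $\delta^T(\mathcal{T}', \mathcal{S}') \le d$.

3. If $\mathcal{T} \mapsto^{view(M)} p_1$ and $\mathcal{S} \mapsto^{view(M)} p_2$ then $|p_1 - p_2| \le d$.

Proof. 1. Suppose $\mathcal{T} = \{(t_i)^{p_i}\}$, then we have that there exists $t \in \mathsf{S}(\mathcal{T})$ with $t \to \{(t'_j)^{p_j}\}$; we set
$\mathcal{T}' = \mathcal{T} \setminus \{(t)^{p}\} + \{(t'_j)^{p \cdot p_j}\}$. By the small-step rules we have $\mathcal{T} \Rightarrow^{\varepsilon} \mathcal{T}'$, so we have for every
trace $S$:

\[ \Pr(\mathcal{T}, S) = \Pr(\mathcal{T}, \varepsilon \cdot S) = \Pr(\mathcal{T}', S) \]

So, since for all traces $S$ we have $\Pr(\mathcal{T}, S) = \Pr(\mathcal{T}', S)$, we get the thesis, $\delta^T(\mathcal{T}, \mathcal{S}) = \delta^T(\mathcal{T}', \mathcal{S})$.

2. It comes from the fact that the quantification is over a smaller set of traces, so the distance
cannot be greater.

3. It comes from the fact that the quantification catches the trace $view(M)$.

□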


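Definition 8 Given two pair distributions $\mathcal{P}, \mathcal{Q}$, we say that they are $d$-related, and we write $\mathcal{P} \nabla_d \mathcal{Q}$,
if there exist contexts $\{C_i\}_{i \in I}$, term distributions $\{\mathcal{T}_i\}_{i \in I}$, $\{\mathcal{S}_i\}_{i \in I}$, and probabilities $\{p_i\}, \{q_i\}, \{r_i\}$ with
$\sum p_i = \sum q_i = \sum r_i = 1$ such that $\mathcal{P} = \{(C_i, \mathcal{T}_i)^{p_i}\}$, $\mathcal{Q} = \{(C_i, \mathcal{S}_i)^{q_i}\}$, with:

\[ \delta^T(\mathcal{T}_i, \mathcal{S}_i) \le d \ \text{ for all } i \qquad \text{and} \qquad \begin{cases} p_i = q_i = r_i, & \text{if } C_i \neq t; \\ |p_i - q_i| \le r_i \cdot d, & \text{if } C_i = t. \end{cases} \]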


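1 Following the literature on the subject, this stands for any function $\delta : A \times A \to \mathbb{R}$ such that $\delta(x,y) = \delta(y,x)$,
$\delta(x,x) = 0$ and $\delta(x,y) + \delta(y,z) \ge \delta(x,z)$.

Lemma 11 (Internal d-stability) Given two pair distributions $\mathcal{P}, \mathcal{Q}$ with $\mathcal{P} \nabla_d \mathcal{Q}$, if there
exists $\mathcal{P}'$ such that $\mathcal{P} \to \mathcal{P}'$ then there exists $\mathcal{Q}'$ such that $\mathcal{Q} \Rightarrow^{\varepsilon} \mathcal{Q}'$ or $\mathcal{Q} \to \mathcal{Q}'$, and $\mathcal{P}' \nabla_d \mathcal{Q}'$.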

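Proof. The pair distribution $\mathcal{P}$ can reduce to $\mathcal{P}'$ in two different ways: we could have a term
distribution reduction, i.e. $\mathsf{S}(\mathcal{P}) \ni (C, \mathcal{T}) \to \{(C, \mathcal{T}')^1\}$, or a context reduction, i.e. $\mathsf{S}(\mathcal{P}) \ni (C, \mathcal{T}) \to
\{(C_i, \mathcal{T}_i)^{p_i}\}$, so let us prove the statement for the two cases:

1. Term distribution reduction.

Suppose that the pair in $\mathcal{P}$ that reduces is $(C, \mathcal{T})^p$; by definition there exists $\mathcal{Q} \ni (C, \mathcal{S})^q$
such that $\delta^T(\mathcal{T}, \mathcal{S}) \le d$ and $p = q = r$ if $C \neq t$, $|p - q| \le r \cdot d$ otherwise.

If $\mathcal{T} \to \mathcal{T}'$, by the small-step rules we have $\mathcal{P} \to \mathcal{P}' = \mathcal{P} \setminus \{(C, \mathcal{T})^p\} + \{(C, \mathcal{T}')^p\}$; so if we set
$\mathcal{Q}' = \mathcal{Q}$ we have $\mathcal{Q} \Rightarrow^{\varepsilon} \mathcal{Q}'$ and obviously $\mathcal{P}' \nabla_d \mathcal{Q}'$.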

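2. Context reduction.

When we face a context reduction we have to work on different cases:

(a) Suppose that the pair that reduces is $\mathcal{P} \ni (case_A(C, t_0, t_1, t_\varepsilon), \mathcal{T})^p$ with $(C, \mathcal{T}) \in V^{Str}$.

If $C = m$ then there exists $\mathcal{Q} \ni (case_A(m, t_0, t_1, t_\varepsilon), \mathcal{S})^q$ with $\delta^T(\mathcal{T}, \mathcal{S}) \le d$ and $|p - q| \le r \cdot d$.

By the one-step rules we have:

\[ (case_A(m, t_0, t_1, t_\varepsilon), \mathcal{T}) \to \begin{cases} \{(t_0, \mathcal{T})^1\}, & \text{if } m = 0n; \\ \{(t_1, \mathcal{T})^1\}, & \text{if } m = 1n; \\ \{(t_\varepsilon, \mathcal{T})^1\}, & \text{if } m = \varepsilon. \end{cases} \]

and the same for

\[ (case_A(m, t_0, t_1, t_\varepsilon), \mathcal{S}) \to \begin{cases} \{(t_0, \mathcal{S})^1\}, & \text{if } m = 0n; \\ \{(t_1, \mathcal{S})^1\}, & \text{if } m = 1n; \\ \{(t_\varepsilon, \mathcal{S})^1\}, & \text{if } m = \varepsilon. \end{cases} \]

So we have

\[ \mathcal{P}' = \mathcal{P} \setminus \{(case_A(m, t_0, t_1, t_\varepsilon), \mathcal{T})^p\} + \{(t', \mathcal{T})^p\} \]

and if we set

\[ \mathcal{Q}' = \mathcal{Q} \setminus \{(case_A(m, t_0, t_1, t_\varepsilon), \mathcal{S})^q\} + \{(t', \mathcal{S})^q\} \]

where $t'$ is one among $t_0, t_1, t_\varepsilon$ depending on $m$, then we get the thesis.

If $C = [\cdot]$ there exists $\mathcal{Q} \ni (case_A([\cdot], t_0, t_1, t_\varepsilon), \mathcal{S})^q$ such that $\delta^T(\mathcal{T}, \mathcal{S}) \le d$ and $p = q = r$.
By the one-step rules we have:

\[ (case_A([\cdot], t_0, t_1, t_\varepsilon), \mathcal{T}) \to \{(t_0, \mathcal{T})^{\mathcal{T}(M_0)}, (t_1, \mathcal{T})^{\mathcal{T}(M_1)}, (t_\varepsilon, \mathcal{T})^{\mathcal{T}(\varepsilon)}\} \]

with $M_0 = \{0m\}_{m \in V^{Str}}$, $M_1 = \{1m\}_{m \in V^{Str}}$, and similarly for $\mathcal{S}$.
So we have

\[ \mathcal{P}' = \mathcal{P} \setminus \{(case_A([\cdot], t_0, t_1, t_\varepsilon), \mathcal{T})^p\} + \{(t_0, \mathcal{T})^{p \cdot \mathcal{T}(M_0)}, (t_1, \mathcal{T})^{p \cdot \mathcal{T}(M_1)}, (t_\varepsilon, \mathcal{T})^{p \cdot \mathcal{T}(\varepsilon)}\} \]

and if we set

\[ \mathcal{Q}' = \mathcal{Q} \setminus \{(case_A([\cdot], t_0, t_1, t_\varepsilon), \mathcal{S})^q\} + \{(t_0, \mathcal{S})^{q \cdot \mathcal{S}(M_0)}, (t_1, \mathcal{S})^{q \cdot \mathcal{S}(M_1)}, (t_\varepsilon, \mathcal{S})^{q \cdot \mathcal{S}(\varepsilon)}\} \]

we know $|p \cdot \mathcal{T}(M) - q \cdot \mathcal{S}(M)| = r \cdot |\mathcal{T}(M) - \mathcal{S}(M)| \le r \cdot d$, for all $M$, and then the thesis.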

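(b) The case $(rec_A(C, t_0, t_1, t_\varepsilon), \mathcal{T})$ with $(C, \mathcal{T}) \in V^{Str}$ is similar.

(c) If the pair that reduces is $\mathcal{P} \ni (t, \mathcal{T})^p$ then we have that there exists $\mathcal{Q} \ni (t, \mathcal{S})^q$ with
$|p - q| \le r \cdot d$. So if $t \to \{(t_i)^{p_i}\}$ we have $\mathcal{P}' = \mathcal{P} \setminus \{(t, \mathcal{T})^p\} + \{(t_i, \mathcal{T})^{p \cdot p_i}\}$; if we set
$\mathcal{Q}' = \mathcal{Q} \setminus \{(t, \mathcal{S})^q\} + \{(t_i, \mathcal{S})^{q \cdot p_i}\}$ and $r_i = r \cdot p_i$ we have $\sum_i r_i = r$ and $|p_i \cdot p - p_i \cdot q| \le r_i \cdot d$,
so we get the thesis.

(d) If the pair that reduces is $\mathcal{P} \ni (Cv, \mathcal{T})^p$ with $C \neq \lambda x.C'$, then we have by the one-step
rules $(Cv, \mathcal{T}) \to \{(C', \mathcal{T}')^1\}$. By definition of $\nabla_d$ there exists $\mathcal{Q} \ni (Cv, \mathcal{S})^q$ with $p = q$
and $(Cv, \mathcal{S}) \to \{(C', \mathcal{S}')^1\}$; so we have:

\[ \mathcal{P} \to \mathcal{P}' = \mathcal{P} \setminus \{(Cv, \mathcal{T})^p\} + \{(C', \mathcal{T}')^p\} \]

\[ \mathcal{Q} \to \mathcal{Q}' = \mathcal{Q} \setminus \{(Cv, \mathcal{S})^q\} + \{(C', \mathcal{S}')^q\} \]

but by a previous lemma $\delta^T(\mathcal{T}', \mathcal{S}') \le d$ and so we have the thesis.

(e) Otherwise, if we have another pair that reduces, $\mathsf{S}(\mathcal{P}) \ni (C, \mathcal{T}) \to \{(C_i, \mathcal{T}_i)^{p_i}\}$, we have
$\mathcal{P}' = \mathcal{P} \setminus \{(C, \mathcal{T})^p\} + \{(C_i, \mathcal{T}_i)^{p \cdot p_i}\}$; if we set $\mathcal{Q}' = \mathcal{Q} \setminus \{(C, \mathcal{S})^q\} + \{(C_i, \mathcal{S}_i)^{q \cdot p_i}\}$ we get the
thesis. Indeed by a previous lemma we know $\delta^T(\mathcal{T}_i, \mathcal{S}_i) \le d$ and by definition $p = q$, so
$p \cdot p_i = q \cdot p_i$ for all $i$.


□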

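Lemma 12 (d-Relatedness, internally) Given two pair distributions $\mathcal{P}, \mathcal{Q}$ with $\mathcal{P} \nabla_d \mathcal{Q}$, then
there exist normal pair distributions $\mathcal{P}', \mathcal{Q}'$ with $\mathcal{P} \Rightarrow^{\varepsilon} \mathcal{P}'$ or $\mathcal{P} \to \mathcal{P}'$, and $\mathcal{Q} \Rightarrow^{\varepsilon} \mathcal{Q}'$ or $\mathcal{Q} \to \mathcal{Q}'$,
such that $\mathcal{P}' \nabla_d \mathcal{Q}'$.

Lemma 13 (d-Relatedness, externally) Given two pair distributions with $\mathcal{P} \nabla_d \mathcal{Q}$, then for every
trace $S$:

1. If $S$ does not end with the action $view(\cdot)$ then there exist normal pair distributions $\mathcal{P}', \mathcal{Q}'$ such
that $\mathcal{P} \Rightarrow^{S} \mathcal{P}'$ and $\mathcal{Q} \Rightarrow^{S} \mathcal{Q}'$ with $\mathcal{P}' \nabla_d \mathcal{Q}'$.

2. Otherwise, if $\mathcal{P} \mapsto^{S} p_1$ and $\mathcal{Q} \mapsto^{S} p_2$ we have $|p_1 - p_2| \le d$.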

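Proof. We proceed by induction on the length of $S$, first proving the first case and then the second
one.

1. If $S = \varepsilon$ then we get the thesis by Lemma 12.

If $S = S' \cdot pass(v)$ then by the small-step rules we have $\mathcal{P} \Rightarrow^{S'} \mathcal{P}'$, $\mathcal{Q} \Rightarrow^{S'} \mathcal{Q}'$ with $\mathcal{P}' = \{(C_i, \mathcal{T}_i)^{p_i}\}$,
$\mathcal{Q}' = \{(C_i, \mathcal{S}_i)^{q_i}\}$ normal pair distributions and by induction hypothesis we have
$\mathcal{P}' \nabla_d \mathcal{Q}'$.

By the one-step rules we know that the action $pass(v)$ is allowed only if $C = \lambda x.C'$ or $C = [\cdot]$
and $\mathcal{T} = \{(\lambda x.t_h)^{a_h}\}$. So if we set $J = \{j \in I \mid C_j = \lambda x.C'_j\}$, $K = \{k \in I \mid C_k = [\cdot]\}$, by the
small-step rules we have:

\[ \mathcal{P}' \Rightarrow^{pass(v)} \mathcal{P}'' = \{(C'_j\{v/x\}, \mathcal{T}_j)^{p_j}\} + \{([\cdot], \mathcal{T}'_k)^{p_k}\} \qquad (1) \]

\[ \mathcal{Q}' \Rightarrow^{pass(v)} \mathcal{Q}'' = \{(C'_j\{v/x\}, \mathcal{S}_j)^{q_j}\} + \{([\cdot], \mathcal{S}'_k)^{q_k}\} \qquad (2) \]

but by a previous lemma $\delta^T(\mathcal{T}'_k, \mathcal{S}'_k) \le d$, so we have $\mathcal{P}'' \nabla_d \mathcal{Q}''$ and by applying Lemma 12 we
get the thesis.

2. If $S = S' \cdot view(M)$ then we have by induction hypothesis $\mathcal{P} \Rightarrow^{S'} \mathcal{P}' = \{(C_i, \mathcal{T}_i)^{p_i}\}$, $\mathcal{Q} \Rightarrow^{S'}
\mathcal{Q}' = \{(C_i, \mathcal{S}_i)^{q_i}\}$ with $\mathcal{P}', \mathcal{Q}'$ normal pair distributions and $\mathcal{P}' \nabla_d \mathcal{Q}'$.

By the small-step rules we know that

\[ \mathcal{P}' \mapsto^{view(M)} p = \sum_i p_i \cdot p'_i \]

with $(C_i, \mathcal{T}_i) \mapsto^{view(M)} p'_i$ and $\mathcal{P}'((C_i, \mathcal{T}_i)) = p_i$.

Similarly

\[ \mathcal{Q}' \mapsto^{view(M)} q = \sum_i q_i \cdot q'_i \]

with $(C_i, \mathcal{S}_i) \mapsto^{view(M)} q'_i$ and $\mathcal{Q}'((C_i, \mathcal{S}_i)) = q_i$.

At this point we make a distinction: by the one-step rules we know that the action $view(M)$
is allowed only if $C_i = m_i$ or $C_i = [\cdot]$ and the term distribution in the pair is a string
distribution, so we set $J = \{j \in I \mid C_j = m_j\}$, $K = \{k \in I \mid C_k = [\cdot]\}$. Obviously we have
$I = J + K$.

By the one-step rules we have:

\[ (m_j, \mathcal{T}_j) \mapsto^{view(M)} \begin{cases} 1, & \text{if } m_j \in M; \\ 0, & \text{otherwise,} \end{cases} \qquad ([\cdot], \mathcal{T}_k) \mapsto^{view(M)} \mathcal{T}_k(M) \]

and

\[ (m_j, \mathcal{S}_j) \mapsto^{view(M)} \begin{cases} 1, & \text{if } m_j \in M; \\ 0, & \text{otherwise,} \end{cases} \qquad ([\cdot], \mathcal{S}_k) \mapsto^{view(M)} \mathcal{S}_k(M) \]

So we have:

\[ p = \sum_j p_j \cdot p'_j + \sum_k p_k \cdot \mathcal{T}_k(M) \qquad q = \sum_j q_j \cdot q'_j + \sum_k q_k \cdot \mathcal{S}_k(M) \]

and then, writing $p'_k = \mathcal{T}_k(M)$ and $q'_k = \mathcal{S}_k(M)$:

\begin{align*}
|p - q| &= \Big| \sum_j p_j \cdot p'_j + \sum_k p_k \cdot p'_k - \sum_j q_j \cdot q'_j - \sum_k q_k \cdot q'_k \Big| \\
&= \Big| \sum_j (p_j \cdot p'_j - q_j \cdot q'_j) + \sum_k (p_k \cdot p'_k - q_k \cdot q'_k) \Big|
\end{align*}

But, by the fact that $\mathcal{P}' \nabla_d \mathcal{Q}'$ we can say:

(a) $p'_j = q'_j$, by the fact that $C_j = m_j$ is the same for $\mathcal{P}'$ and $\mathcal{Q}'$, so we call both $r'_j$.

(b) $|p_j - q_j| \le r_j \cdot d$. Furthermore $r'_j \cdot |p_j - q_j| \le r_j \cdot d$ because $r'_j \in \{0,1\}$.

(c) $p_k = q_k = r_k$ for all $k$, because $C_k = [\cdot] \neq t$.

(d) $|\mathcal{T}_k(M) - \mathcal{S}_k(M)| \le \delta^T(\mathcal{T}_k, \mathcal{S}_k) \le d$ for all $k$.

Therefore:

\begin{align*}
|p - q| &= \Big| \sum_j r'_j \cdot (p_j - q_j) + \sum_k r_k \cdot (\mathcal{T}_k(M) - \mathcal{S}_k(M)) \Big| \\
&\le \Big| \sum_j r'_j \cdot (p_j - q_j) \Big| + \Big| \sum_k r_k \cdot (\mathcal{T}_k(M) - \mathcal{S}_k(M)) \Big| \\
&\le \sum_j r'_j \cdot |p_j - q_j| + \sum_k r_k \cdot |\mathcal{T}_k(M) - \mathcal{S}_k(M)| \le \sum_j r_j \cdot d + \sum_k r_k \cdot d \\
&= \sum_i r_i \cdot d = d \cdot \sum_i r_i = d
\end{align*}

and so the thesis.

□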

Theorem 1 (Non-expansiveness) Given two term distributions $\mathcal{T}, \mathcal{S}$ such that $\delta^T(\mathcal{T}, \mathcal{S}) = d$, then
for all contexts $C$ we have that $\delta^T(C[\mathcal{T}], C[\mathcal{S}]) \le d$.

Proof. In order to get the thesis we have to prove that for all traces $S$ we have that $|\Pr(C[\mathcal{T}], S) -
\Pr(C[\mathcal{S}], S)| \le d$.

• If $S$ does not end with the $view(\cdot)$ action then we have: $|\Pr(C[\mathcal{T}], S) - \Pr(C[\mathcal{S}], S)| =
|1 - 1| = 0 \le d$.

• Otherwise we have that $(C, \mathcal{T}) \mapsto^{S} p_1 = \Pr(C[\mathcal{T}], S)$ and similarly $(C, \mathcal{S}) \mapsto^{S} p_2 = \Pr(C[\mathcal{S}], S)$.
But it is clear that $\{(C, \mathcal{T})^1\} \nabla_d \{(C, \mathcal{S})^1\}$, and so by Lemma 13 we have $|p_1 - p_2| \le d$ and
then the thesis.

□


Theorem 2 For all $t, s$, $\delta^C(t, s) \le \delta^T(t, s)$.

Proof. By the previous theorem we know that, if $\delta^T(t,s) = d$ then for all contexts $C$, we have
$\delta^T(C[t], C[s]) \le d$; so:

\begin{align*}
\delta^C(t,s) &= \sup_C \big| [\![C[t]]\!](\varepsilon) - [\![C[s]]\!](\varepsilon) \big| = \sup_C \big| \Pr(C[t], view(\varepsilon)) - \Pr(C[s], view(\varepsilon)) \big| \\
&\le \sup_C \delta^T(C[t], C[s]) \le d
\end{align*}

□


As a corollary of non-expansiveness, one gets that: 

Theorem 6 (Full Abstraction) For all $t, s$, $\delta^T(t,s) = \delta^C(t,s)$.

Proof. $\delta^T(t, s) \le \delta^C(t,s)$ because by the full abstraction lemma for all traces $T$ there exists
a context $C_T$ such that $[\![C_T[t]]\!](\varepsilon) = \Pr(t, T)$ and so the quantification over contexts catches
the quantification over traces. The other inequality, $\delta^C(t,s) \le \delta^T(t,s)$, is a consequence of
non-expansiveness. □

One may wonder whether a coinductive notion of distance, sort of a metric analogue to applicative
bisimilarity, can be defined. The answer is positive [7]. It however suffers from the same problems
applicative bisimilarity has: in particular, it is not fully abstract.


5 Computational Indistinguishability 

In this section we show how our notions of equivalence and distance relate to computational
indistinguishability (CI in the following), a key notion in modern cryptography.

Definition 9 Two distribution ensembles $\{D_n\}_{n \in \mathbb{N}}$ and $\{E_n\}_{n \in \mathbb{N}}$ (where both $D_n$ and $E_n$ are distributions
on binary strings) are said to be computationally indistinguishable iff for every PPT algorithm
$A$ the following quantity is a negligible function of $n \in \mathbb{N}$: $\big| \Pr_{x \leftarrow D_n}(A(x, 1^n) = \varepsilon) - \Pr_{x \leftarrow E_n}(A(x, 1^n) = \varepsilon) \big|$.
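
The quantity in Definition 9 can be computed exactly for small, explicitly given ensembles. The Python sketch below is an illustration only: the two ensembles (`uniform` and `skewed`) and the distinguisher are hypothetical choices, and the algorithm $A$ is abstracted as a function returning the probability that $A(x, 1^n)$ outputs $\varepsilon$.

```python
from fractions import Fraction
from itertools import product

def uniform(n):
    """D_n: the uniform distribution on n-bit strings."""
    strings = ["".join(bits) for bits in product("01", repeat=n)]
    return {x: Fraction(1, len(strings)) for x in strings}

def skewed(n):
    """E_n: uniform, except that the mass of 0^n is moved onto 1^n (n >= 1)."""
    dist = uniform(n)
    dist["1" * n] += dist.pop("0" * n)
    return dist

def advantage(dist_d, dist_e, accept, n):
    """|Pr_{x<-D_n}(A(x,1^n) = eps) - Pr_{x<-E_n}(A(x,1^n) = eps)|."""
    pr_d = sum(p * accept(x, n) for x, p in dist_d.items())
    pr_e = sum(p * accept(x, n) for x, p in dist_e.items())
    return abs(pr_d - pr_e)

def accepts_all_zero(x, n):
    """Hypothetical distinguisher: outputs eps exactly on the string 0^n."""
    return Fraction(1) if x == "0" * n else Fraction(0)

for n in range(1, 5):
    print(n, advantage(uniform(n), skewed(n), accepts_all_zero, n))
```

For this particular distinguisher the advantage is $2^{-n}$, which is negligible; since the two distributions differ only on $0^n$ and $1^n$, no other distinguisher can do better on this pair of ensembles.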

It is a well-known fact in cryptography that in the definition above, A can be assumed to sample 
from x just once without altering the definition itself, provided the two involved ensembles are 
efficiently computable (0, Theorem 3.2.6, page 108). This is in contrast to the case of arbitrary 
ensembles [£]• 

The careful reader should have already spotted the similarity between CI and the notion of
context distance as given in Section 4. There are some key differences, though:

1. While context distance is an absolute notion of distance, CI depends on a parameter $n$, the
so-called security parameter.

2. In computational indistinguishability, one can compare distributions over strings, while the
context distance can evaluate how far terms of arbitrary types are.

The discrepancy Point 1 puts in evidence, however, can be easily overcome by turning the context
distance into something slightly more parametric.

Definition 10 (Parametric Context Equivalence) Given two terms $t, s$ such that $\vdash t, s :
aStr \to A$, we say that $t$ and $s$ are parametrically context equivalent iff for every context $C$
such that $\vdash C[\vdash A] : Str$ we have that $\big| [\![C[t\,1^n]]\!](\varepsilon) - [\![C[s\,1^n]]\!](\varepsilon) \big|$ is negligible in $n$.

2 A negligible function is a function which tends to 0 faster than any inverse polynomial (see [8] for more details).

This way, we have obtained a characterization of CI:

Theorem 7 Let $t, s$ be two terms of type $aStr \to Str$. Then $t, s$ are parametrically context equivalent
iff the distribution ensembles $\{[\![t\,1^n]\!]\}_{n \in \mathbb{N}}$ and $\{[\![s\,1^n]\!]\}_{n \in \mathbb{N}}$ are computationally indistinguishable.

Please observe that Theorem 7 only deals with terms of type $aStr \to Str$. The significance of
parametric context equivalence when instantiated to terms of type $aStr \to A$, where $A$ is a
higher-order type, will be discussed in Section 5.3 below.


5.1 Computational Indistinguishability and Traces 

How could traces capture the peculiar way parametric context equivalence treats the security
parameter? First of all, observe that, in Definition 10, the security parameter is passed to the term
being tested without any intervention from the context. The most important difference, however,
is that contexts are objects which test families of terms rather than terms. As a consequence, the
action $view(\cdot)$ does not take strings or finite sets of strings as arguments (as in equivalences or
metrics), but rather distinguishers: namely closed RSLR terms of type $aStr \to Str$ that we denote
with the metavariable $D$. The probability that a term $t$ of type $Str$ satisfies one such action $view(D)$
is $\sum_m [\![t]\!](m) \cdot [\![D\,m]\!](\varepsilon)$.

A trace $T$ is said to be parametrically compatible for a type $aStr \to A$ if it is compatible for $A$.
This is the starting point for the following definition:

Definition 11 Two terms $t, s : A$ are parametrically trace equivalent, and we write $t \simeq^T_n s$, iff for
every trace $T$ which is parametrically compatible with $A$, there is a negligible function $negl : \mathbb{N} \to
\mathbb{R}_{[0,1]}$ such that $|\Pr(t, pass(1^n) \cdot T) - \Pr(s, pass(1^n) \cdot T)| \le negl(n)$.

The fact that parametric trace equivalence and parametric context equivalence are strongly related
is quite intuitive: they are obtained by altering in a very similar way two notions which are already
known to coincide (by Theorem 6). Indeed:

Theorem 8 Parametric trace equivalence and parametric context equivalence coincide.

The first inclusion is trivial: every trace can be easily emulated by a context. The other
one, as usual, is more difficult, and requires a careful analysis of the behavior of terms depending
on a parameter, when put in a context. Overall, however, the structure of the proof is similar to the
one we presented in Section 3.1. The first step towards the proof is the introduction of a particular
class of distinguishers $D^m$ such that:

\[ [\![D^m\,m']\!](\varepsilon) = \begin{cases} 1 & \text{if } m' = m; \\ 0 & \text{otherwise.} \end{cases} \]

We formalize the use of a distinguisher as argument of the action $view(\cdot)$ by giving the rules in
Figure 9.
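
The probability of satisfying an action view(D) is a weighted sum, which can be sketched in a few lines of Python. Here a distinguisher is abstracted as a function from strings to the probability that it returns the empty string; this abstraction and the names `view_probability` and `point_distinguisher` are assumptions made for illustration.

```python
from fractions import Fraction

def point_distinguisher(m):
    """The distinguisher D^m: accepts m' with probability 1 iff m' == m."""
    return lambda m_prime: Fraction(1) if m_prime == m else Fraction(0)

def view_probability(term_dist, distinguisher):
    """Sum over m of [[t]](m) * [[D m]](eps), for a string distribution [[t]]."""
    return sum(p * distinguisher(m) for m, p in term_dist.items())

dist = {"0": Fraction(1, 4), "01": Fraction(3, 4)}

# With a point distinguisher, view(D^m) recovers the probability of the string m.
print(view_probability(dist, point_distinguisher("01")))

# A hypothetical probabilistic distinguisher, accepting m with probability 1/(|m|+1).
print(view_probability(dist, lambda m: Fraction(1, len(m) + 1)))
```

The first call shows why the distinguishers $D^m$ are enough to recover the string-based action view(m) used for equivalences.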

In order to prove that parametric trace equivalence and parametric context equivalence coincide,
we have to refine our approach: differently from Section 3.1 we will show that
if $t, s : aStr \to A$ are parametrically trace equivalent, then for all contexts $\lambda x.C[\vdash A] : bStr \to B$
we have $\lambda x.C[tx] \simeq^T_n \lambda x.C[sx]$. This change is made because it is essential that the context passes
the right security parameter to the term which it is testing; furthermore we will adapt the proof
starting from a pair $(\lambda x.C, \mathcal{T})$ where $\mathcal{T} = \{\mathcal{T}_n\}_{n \in \mathbb{N}}$ is a parametric term distribution, i.e. a
family of term distributions of the form $\mathcal{T}_n = \{(t_i 1^n)^{p_i}\}$.

The idea behind the proof is that starting from $\{(\lambda x.C, \mathcal{T})^1\}$, $\{(\lambda x.C, \mathcal{S})^1\}$, after a sequence of
internal/external reductions performed by the context and the environment, the first reduction
inside the hole is the passing of the security parameter $1^n$, which in our new setting coincides with the
choice of $\mathcal{T}_n \in \mathcal{T}$, $\mathcal{S}_n \in \mathcal{S}$ according to $1^n$; at this point, if we prove that the two pair distributions
are $d$-related, by non-expansiveness we will get the thesis.
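\[ \frac{[\![D\,m]\!](\varepsilon) = p}{(m, \mathcal{T}) \mapsto^{view(D)} p} \qquad \frac{\mathcal{T} = \{(m_i)^{p_i}\} \qquad [\![D\,m_i]\!](\varepsilon) = p'_i}{([\cdot], \mathcal{T}) \mapsto^{view(D)} \sum_i p_i \cdot p'_i} \]

\[ \frac{\sum_m \big( (C, \mathcal{T}) \mapsto^{view(D^{0m})} \big) = p_0 \qquad \sum_m \big( (C, \mathcal{T}) \mapsto^{view(D^{1m})} \big) = p_1 \qquad (C, \mathcal{T}) \mapsto^{view(D^{\varepsilon})} p_\varepsilon}{(case_A(C, t_0, t_1, t_\varepsilon), \mathcal{T}) \to \{(t_0, \mathcal{T})^{p_0}, (t_1, \mathcal{T})^{p_1}, (t_\varepsilon, \mathcal{T})^{p_\varepsilon}\}} \]

\[ \frac{(C, \mathcal{T}) \mapsto^{view(D^{m})} p_m}{(rec_A(C, t_0, t_1, t_\varepsilon), \mathcal{T}) \to \{((t_0 m)(rec_A(n, t_0, t_1, t_\varepsilon)), \mathcal{T})^{p_m}\}_{m = 0n} + \{((t_1 m)(rec_A(n, t_0, t_1, t_\varepsilon)), \mathcal{T})^{p_m}\}_{m = 1n} + \{(t_\varepsilon, \mathcal{T})^{p_\varepsilon}\}} \]

\[ \frac{(C, \mathcal{T}) \Rightarrow^{S} \{(C_i, \mathcal{T}_i)^{p_i}\} \qquad (C_i, \mathcal{T}_i) \mapsto^{view(D)} p'_i}{(C, \mathcal{T}) \mapsto^{S \cdot view(D)} \sum_i p_i \cdot p'_i} \]

Figure 9: Distinguisher 1-step and Small-step rules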

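Lemma 14 Given $\mathcal{T} = \{\mathcal{T}_n\}$, $\mathcal{S} = \{\mathcal{S}_n\}$, with $\mathcal{T}_n = \{(t\,1^n)^1\}$, $\mathcal{S}_n = \{(s\,1^n)^1\}$, if $t \simeq^T_n s$ then for
all $n \in \mathbb{N}$ there exists $\epsilon : \mathbb{N} \to \mathbb{R}$ negligible, such that $\delta^T(\mathcal{T}_n, \mathcal{S}_n) \le \epsilon(n)$.

Proof. If $t \simeq^T_n s$, then there exists $\epsilon$ such that $|\Pr(t, pass(1^n) \cdot T) - \Pr(s, pass(1^n) \cdot T)| \le \epsilon(n)$. So we have
that:

\begin{align*}
|\Pr(t\,1^n, T) - \Pr(s\,1^n, T)| &= |\Pr(\{(t\,1^n)^1\}, T) - \Pr(\{(s\,1^n)^1\}, T)| \\
&= |\Pr(\mathcal{T}_n, T) - \Pr(\mathcal{S}_n, T)| = \delta^T(\mathcal{T}_n, \mathcal{S}_n) \le \epsilon(n)
\end{align*}

□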

Theorem 9 (Parametric Congruence) Given two terms $t, s : aStr \to A$ such that $t \simeq^T_n s$, then
for all contexts $\lambda x.C$ with $\vdash \lambda x.C[\vdash A] : B$ we have: $\lambda x.C[tx] \simeq^T_n \lambda x.C[sx]$.

Proof. Our goal is to prove that for all traces $T$ parametrically compatible with $B$ we have that
there exists $\epsilon : \mathbb{N} \to \mathbb{R}$ negligible such that:

\[ |\Pr(\lambda x.C[tx], pass(1^n) \cdot T) - \Pr(\lambda x.C[sx], pass(1^n) \cdot T)| \le \epsilon(n) \]

We can see the terms inside the hole as parametric term distributions $\mathcal{T} = \{(t\,1^n)^1\}_{n \in \mathbb{N}}$, $\mathcal{S} =
\{(s\,1^n)^1\}_{n \in \mathbb{N}}$; so if we start from the pair distributions $\{(\lambda x.C, \mathcal{T})^1\}$, $\{(\lambda x.C, \mathcal{S})^1\}$ we have that the
first reduction step is external, indeed the environment passes the value $1^n$, so we get:

\[ \{(\lambda x.C, \mathcal{T})^1\} \Rightarrow^{pass(1^n)} \{(C\{1^n/x\}, \mathcal{T})^1\} \qquad \{(\lambda x.C, \mathcal{S})^1\} \Rightarrow^{pass(1^n)} \{(C\{1^n/x\}, \mathcal{S})^1\} \]

At this point we can suppose that the context reduces internally and externally (depending on
its type), so we split the trace $T$ in $T_1 \cdot T_2$, where $T_1$ is the trace performed by the context;
the point is that it reduces in the same way for both pair distributions, so we can say that

\[ \{(C\{1^n/x\}, \mathcal{T})^1\} \Rightarrow^{T_1} \{(C_i, \mathcal{T})^{p_i}\}, \qquad \{(C\{1^n/x\}, \mathcal{S})^1\} \Rightarrow^{T_1} \{(C_i, \mathcal{S})^{p_i}\}. \]

Now the only possible reduction is a term distribution reduction, i.e. a reduction inside the hole,
but this means a choice of a term distribution inside the family depending on $n$; so we get:

\[ \{(C_i, \mathcal{T})^{p_i}\} \to \{(C_i, \{(t\,1^n)^1\})^{p_i}\} \qquad \{(C_i, \mathcal{S})^{p_i}\} \to \{(C_i, \{(s\,1^n)^1\})^{p_i}\} \]

But $t \simeq^T_n s$, so by the previous lemma we have that there exists $\epsilon : \mathbb{N} \to \mathbb{R}$ negligible such that
$\delta^T(\{(t\,1^n)^1\}, \{(s\,1^n)^1\}) \le \epsilon(n)$; furthermore it is obvious that $\{(C_i, \{(t\,1^n)^1\})^{p_i}\} \nabla_d \{(C_i, \{(s\,1^n)^1\})^{p_i}\}$
with $d \le \epsilon(n)$, and by applying Lemma 13 we have that for all traces $T_2$:

\[ |\Pr(\{(C_i, \{(t\,1^n)^1\})^{p_i}\}, T_2) - \Pr(\{(C_i, \{(s\,1^n)^1\})^{p_i}\}, T_2)| \le d \le \epsilon(n) \]

and this means:

\[ |\Pr(\lambda x.C[tx], pass(1^n) \cdot T) - \Pr(\lambda x.C[sx], pass(1^n) \cdot T)| \le \epsilon(n) \]

and then the thesis. □

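Corollary 3 Given two terms $t, s : aStr \to A$, if they are parametrically trace equivalent, then
they are parametrically context equivalent.

Proof. For all contexts $\vdash C[\vdash A] : Str$ we have:

\begin{align*}
\big| [\![C[t\,1^n]]\!](\varepsilon) - [\![C[s\,1^n]]\!](\varepsilon) \big| &= \big| [\![(\lambda x.C[tx])\,1^n]\!](\varepsilon) - [\![(\lambda x.C[sx])\,1^n]\!](\varepsilon) \big| \\
&= \big| \Pr(\lambda x.C[tx], pass(1^n) \cdot view(D^{\varepsilon})) - \Pr(\lambda x.C[sx], pass(1^n) \cdot view(D^{\varepsilon})) \big| \le \epsilon(n)
\end{align*}

where $\epsilon : \mathbb{N} \to \mathbb{R}$ is a negligible function. □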

5.2 An Example 

We propose an example in which we analyze two different programs. Both of them are functions
of type $nStr \to nStr \to Str$: the first one returns the string received in input, padded or cut
depending on its length and on the security parameter; the second one produces a random string
and compares it to the input (padded or cut). If the comparison is negative it returns the input
string (padded or cut), otherwise it returns the opposite. We use some syntactic sugar in order to
make the terms more understandable.

\[ t := \lambda sec.\lambda x.\ LV\ x\ sec \]

\[ s := \lambda sec.\lambda x.\ \mathsf{if}\ (LV\ x\ sec) = (RBG\ sec)\ \mathsf{then}\ \neg(LV\ x\ sec)\ \mathsf{else}\ (LV\ x\ sec) \]

The function $LV$ receives in input two strings and pads or cuts the first one in order to return a
string of the same length as the second one; the function $RBG$ returns a random
string of the same length as the one received in input; and the function $\neg$ flips all the bits of the
string in input. So, for all $n \in \mathbb{N}$, $m \in V^{Str}$, if we set $m' = LV\ m\ 1^n$ we have:

\[ t \Rightarrow^{pass(1^n) \cdot pass(m)} \{(m')^1\} \]

\[ s \Rightarrow^{pass(1^n) \cdot pass(m)} \{(m')^{1-p}, (\neg m')^{p}\} \]

where $p = \Pr[m' = RBG\ 1^n] = \frac{1}{2^n}$. So for all distinguishers $D$, if we set $[\![D\,m']\!](\varepsilon) = p_1$, $[\![D(\neg m')]\!](\varepsilon) = p_2$
we have that:

\[ t \mapsto^{pass(1^n) \cdot pass(m) \cdot view(D)} p_1 \]

\[ s \mapsto^{pass(1^n) \cdot pass(m) \cdot view(D)} (1-p) \cdot p_1 + p \cdot p_2 \]

And so $t, s$ are parametrically trace equivalent; indeed for all $n, m, D$ we have:

\begin{align*}
&|\Pr(t, pass(1^n) \cdot pass(m) \cdot view(D)) - \Pr(s, pass(1^n) \cdot pass(m) \cdot view(D))| \\
&\quad = |p_1 - ((1-p) \cdot p_1 + p \cdot p_2)| = |p_1 - p_1 + p \cdot p_1 - p \cdot p_2| = p \cdot |p_1 - p_2| \le p = \frac{1}{2^n}
\end{align*}

which is negligible. This, in particular, implies that the two terms are parametrically trace
equivalent, thus parametrically context equivalent.
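
The computation above can be replayed exactly on small parameters. The following Python sketch is an illustration under stated assumptions: the padding symbol used by `lv` is taken to be `0` (the text does not fix it), `RBG` is assumed to be uniform on $n$-bit strings, and a distinguisher is abstracted as a function from strings to acceptance probabilities.

```python
from fractions import Fraction

def lv(x, sec):
    """Pad (with '0', an assumption) or cut x to the length of sec."""
    return (x + "0" * len(sec))[:len(sec)]

def neg(bits):
    """Flip every bit of the string."""
    return "".join("1" if b == "0" else "0" for b in bits)

def dist_t(x, sec):
    """Output distribution of t after pass(sec) . pass(x)."""
    return {lv(x, sec): Fraction(1)}

def dist_s(x, sec):
    """Output distribution of s: flipped exactly when RBG hits LV x sec."""
    m = lv(x, sec)
    p = Fraction(1, 2 ** len(sec))   # Pr[m = RBG sec], RBG assumed uniform
    out = {}
    out[m] = out.get(m, Fraction(0)) + (1 - p)
    out[neg(m)] = out.get(neg(m), Fraction(0)) + p
    return out

def view(dist, distinguisher):
    return sum(p * distinguisher(m) for m, p in dist.items())

def gap(x, n, distinguisher):
    sec = "1" * n
    return abs(view(dist_t(x, sec), distinguisher) - view(dist_s(x, sec), distinguisher))

def starts_with_zero(m):
    """Hypothetical distinguisher separating m' from its bitwise negation."""
    return Fraction(1) if m.startswith("0") else Fraction(0)

for n in range(1, 6):
    print(n, gap("0101", n, starts_with_zero))
```

With this distinguisher $p_1 = 1$ and $p_2 = 0$, so the gap attains the upper bound $p = 2^{-n}$ derived above.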

5.3 Higher-Order Computational Indistinguishability? 

Theorem 7 and Theorem 8 together tell us that two terms $t, s$ of type $aStr \to Str$ are parametrically
trace equivalent iff the distributions they denote are computationally indistinguishable. But what
happens if the type of the two terms $t, s$ is in the form $aStr \to A$ where $A$ is a higher-order type?
What do we obtain? Actually, the literature on cryptography does not offer a precise definition of
“higher-order” computational indistinguishability, so a formal comparison with parametric context
equivalence is not possible, yet.

Apparently, linear contexts do not capture equivalences as traditionally employed in cryptography,
already when $A$ is the first-order type $aStr \to Str$. A central concept in cryptography,
indeed, is pseudorandomness, which can be spelled out for strings, giving rise to the concept of a
pseudorandom generator, but also for functions, giving rise to pseudorandom functions [13]. Formally,
a function $F : \{0,1\}^* \to \{0,1\}^* \to \{0,1\}^*$ is said to be a pseudorandom function iff $F(s)$
is a function which is indistinguishable from a random function from $\{0,1\}^n$ to $\{0,1\}^n$ whenever
$s$ is drawn at random from $n$-bit strings. Indistinguishability, again, is defined in terms of PPT
algorithms having oracle access to $F(s)$. Now, having access to an oracle for a function is of course
different from having linear access to it. Indeed, building a linear pseudorandom function is very
easy: $G(s)$ is defined to be the function which returns $s$ independently of the value of its input.
$G$ is of course not pseudorandom in the classical sense, since by testing the function multiple times
a distinguisher immediately sees the difference with a truly random function. On the other hand,
the RSLR term $t_G$ implementing the function $G$ above is such that $\lambda x.t_G s$ is trace equivalent to a
term $r$ where:

• $s$ is a term which produces in output $|x|$ bits drawn at random;

• $r$ is the term $\lambda x.q$ of type $aStr \to bStr \to Str$ such that $q$ returns a random function from
$|x|$-bit strings to $|x|$-bit strings. Strictly speaking, $r$ cannot be an RSLR term, but it can anyway
be used as an idealized construction.
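
The difference between linear access and oracle access to $G$ can be checked exhaustively for very short strings. The Python sketch below is an illustration only: functions are represented as dictionaries, the key $s$ is assumed uniform, and all helper names are hypothetical. It compares the family $\{G(s)\}_s$ with the family of all functions on 2-bit strings, first under a single query and then under two queries.

```python
from fractions import Fraction
from itertools import product

def bitstrings(n):
    return ["".join(bits) for bits in product("01", repeat=n)]

def keyed_constant_functions(n):
    """G(s) for every key s: the function returning s whatever its input."""
    return [{x: s for x in bitstrings(n)} for s in bitstrings(n)]

def all_functions(n):
    """Every function from n-bit strings to n-bit strings."""
    dom = bitstrings(n)
    return [dict(zip(dom, outs)) for outs in product(dom, repeat=len(dom))]

def one_query_distribution(functions, x):
    """Answer distribution for a single query x to a uniformly chosen function."""
    weight = Fraction(1, len(functions))
    dist = {}
    for f in functions:
        dist[f[x]] = dist.get(f[x], 0) + weight
    return dist

def collision_probability(functions, x, y):
    """Probability that two queries x, y to the same function agree."""
    hits = sum(1 for f in functions if f[x] == f[y])
    return Fraction(hits, len(functions))

G, R = keyed_constant_functions(2), all_functions(2)
print(one_query_distribution(G, "00") == one_query_distribution(R, "00"))
print(collision_probability(G, "00", "01"), collision_probability(R, "00", "01"))
```

A single query sees a uniform answer in both cases, while two queries agree with probability 1 for $G$ and only $2^{-n}$ for a random function, which is exactly what a non-linear distinguisher exploits.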

But this is not the end of the story. Sometimes, enforcing linear access to primitives is necessary. 
Consider, as an example, the two terms 

t = λn.(λk.λx.λy.Enc(x,k))Gen(n)        s = λn.(λk.λx.λy.Enc(y,k))Gen(n) 

where Enc is meant to be an encryption function and Gen is a function generating a random 
key. The terms t and s should be considered equivalent whenever Enc is a secure cryptoscheme. But if Enc 
is secure against passive attacks (but not against active attacks), the two terms can possibly be 
distinguished with high probability if copying is available. The two terms can indeed be proved 
to be parametrically context equivalent if Enc is the cryptoscheme induced by a pseudorandom 
generator. 
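
The role of copying can be made concrete with a Python sketch. It is only an illustration under stated assumptions: `prg` uses SHAKE-256 as a stand-in for a pseudorandom generator, `enc` is the scheme Enc(m, k) = PRG(k) XOR m, the key is assumed to be sampled once when n is supplied, and `copying_distinguisher` is a hypothetical non-linear observer, not an RSLR context.

```python
import hashlib
import secrets


def prg(seed: bytes, out_len: int) -> bytes:
    """Stand-in for a pseudorandom generator (assumption: SHAKE-256)."""
    return hashlib.shake_256(seed).digest(out_len)


def enc(msg: bytes, key: bytes) -> bytes:
    """Enc(m, k) = PRG(k) XOR m, the scheme induced by the generator."""
    pad = prg(key, len(msg))
    return bytes(p ^ m for p, m in zip(pad, msg))


def gen(n: int) -> bytes:
    """Gen(n): a fresh random key of n bytes."""
    return secrets.token_bytes(n)


def t(n: int):
    k = gen(n)  # key fixed once n is supplied (assumed evaluation order)
    return lambda x: lambda y: enc(x, k)


def s(n: int):
    k = gen(n)
    return lambda x: lambda y: enc(y, k)


def copying_distinguisher(term, n: int = 16) -> bool:
    """Uses the keyed function twice; returns True iff it guesses `t`."""
    f = term(n)
    x0, x1, y = bytes(n), b"\xff" * n, bytes(n)
    # t encrypts the first argument, so the two ciphertexts differ;
    # s encrypts y both times under the same key, so they coincide.
    return f(x0)(y) != f(x1)(y)
```

Here `copying_distinguisher(t)` returns `True` and `copying_distinguisher(s)` returns `False`, so the two terms are separated as soon as the keyed function may be used twice. An observer restricted to a single use sees one ciphertext under a fresh key, which is the situation covered by security against passive attacks.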

Summing up, parametrized context equivalence coincides with CI when instantiated on base 
types, has some interest also on higher-order types, but is different from the kind of equivalences 
cryptographers use when dealing with higher-order objects (e.g. when defining pseudorandom 
functions). This discrepancy is mainly due to the linearity of the contexts we consider here. It 
seems however very hard to overcome it by just considering arbitrary nonlinear contexts instead 
of linear ones. Indeed, it would be hard to encode any arbitrary PPT distinguisher accessing an 
oracle by an RSLR context: those adversaries are only required to be PPT for oracles implementing 
certain kinds of functions (e.g. n bits to n bits, as in the case of pseudorandomness), while filling an 
RSLR context with any PPT algorithm is guaranteed to result in a PPT algorithm. This is anyway 
a very interesting problem, which is outside the scope of this paper, and that we are currently 
investigating in the context of a different, more expressive, probabilistic λ-calculus. 

6 Conclusions 

In this paper, we have studied notions of equivalence and metrics in a language for higher-order 
probabilistic polytime computation. More specifically, we have shown that the discriminating 
power of linear contexts can be captured by traces, both when equivalences and metrics are 
considered. Finally, we gave evidence on how applicative bisimilarity is a sound, but not fully 
abstract, methodology for context equivalence. 

We believe, however, that the main contribution of this work is the new light it sheds on the 
relations between computational indistinguishability, linear contexts and traces. In particular, this 
approach, which is implicitly used in the literature on the subject, is shown to have some 
limitations, but also to suggest a notion of higher-order indistinguishability which could possibly 
be an object of study in itself. This is indeed the main direction for future work we foresee. 